Abstract. We present a static analysis by Abstract Interpretation to check for run-time
errors in parallel and multi-threaded C programs. Following our work on Astree, we focus
on embedded critical programs without recursion or dynamic memory allocation, but
extend the analysis to a static set of threads communicating implicitly through a shared 
memory and explicitly using a finite set of mutual exclusion locks, and scheduled according 
to a real-time scheduling policy and fixed priorities. Our method is thread-modular. It is 
based on a slightly modified non-parallel analysis that, when analyzing a thread, applies 
and enriches an abstract set of thread interferences. An iterator then re-analyzes each 
thread in turn until interferences stabilize. We prove the soundness of our method with 
respect to the sequential consistency semantics, but also with respect to a reasonable 
weakly consistent memory semantics. We also show how to take into account mutual 
exclusion and thread priorities through a partitioning over an abstraction of the scheduler 
state. We present preliminary experimental results analyzing an industrial program with 
our prototype, Thesee, and demonstrate the scalability of our approach. 



1. Introduction 

Ensuring the safety of critical embedded software is important as a single "bug" can have 
catastrophic consequences. Previous work on the Astree analyzer [8] demonstrated that 
static analysis by Abstract Interpretation could help, when specializing an analyzer to a 
class of properties and programs — namely in that case, the absence of run-time errors (such 
as arithmetic and memory errors) on synchronous control / command embedded avionic 
C software. In this article, we describe ongoing work to achieve similar results for
multi-threaded and parallel embedded C software. Such an extension is demanded by the current
trend in critical embedded systems to switch from large numbers of single-program processors
communicating through a common bus to single-processor multi-threaded applications
communicating through a shared memory, for instance in the context of Integrated Modular
Avionics [60]. Analyzing each thread independently with a tool such as Astree would
not be sound and could miss bugs that only appear when threads interact. In this article, 
we focus on detecting the same kinds of run-time errors as Astree does, while taking thread 
communications into account in a sound way, including accesses to the shared memory 
and synchronization primitives. In particular, we correctly handle the effect of concurrent 
threads accessing a common variable without enforcing mutual exclusion by synchronization 
primitives, and we report such accesses — these will be called data-races in the rest of the 
article. However, we ignore other concurrency hazards such as dead-locks, live-locks, and 
priority inversions, which are considered to be orthogonal issues. 

Our method is based on Abstract Interpretation [13], a general theory of the approximation
of semantics which allows designing static analyzers that are fully automatic and
sound by construction — i.e., consider a superset of all program behaviors. Such analyzers 
cannot miss any bug in the class of errors they analyze. However, they can cause spurious 
alarms due to over-approximations, an unfortunate effect we wish to minimize while keeping 
the analysis efficient. 

To achieve scalability, our method is thread-modular and performs a rely-guarantee 
reasoning, where rely and guarantee conditions are inferred automatically. At its core, it 
performs a sequential analysis of each thread considering an abstraction of the effects of 
the other threads, called interferences. Each sequential analysis also collects a new set of 
interferences generated by the analyzed thread. It then serves as input when analyzing 
the other threads. Starting from an empty set of interferences, threads are re-analyzed in 
sequence until a fixpoint of interferences is reached for all threads. Using this scheme, few
modifications are required to a sequential analyzer in order to analyze multi-threaded programs.
Practical experiments suggest that few thread re-analyses are required in practice,
resulting in a scalable analysis. The interferences are considered in a flow-insensitive and
non-relational way: they store, for each variable, an abstraction of the set of all values it
can hold at any program point of a given thread. Our method is however quite generic
in the way individual threads are analyzed. They can be analyzed in a fully or partially
flow-sensitive, context-sensitive, path-sensitive, and relational way (as is the case in our
prototype).

As we target embedded software, we can safely assume that there is no recursion, no dynamic
allocation of memory, and no dynamic creation of threads or locks, which makes the
analysis easier. In return, we handle two subtle points. Firstly, we consider a weakly consistent
memory model: memory accesses not protected by mutual exclusion (i.e., data-races)
may cause behaviors that are not the result of any thread interleaving to appear. The
reason is that arbitrary observation by concurrent threads can expose compiler and processor
optimizations (such as instruction reordering) that are designed to be transparent
on non-parallel programs only. We prove that our semantics is invariant by large classes
of widespread program transformations, so that an analysis of the original program is also
sound with respect to reasonably compiled and optimized versions. Secondly, we show how
to take into account the effect of a real-time scheduler that schedules the threads on a
single processor following strict, fixed priorities. According to this scheduling algorithm,
which is quite common in the realm of embedded real-time software (e.g., in the real-time
thread extension of the POSIX standard [34], or in the ARINC 653 avionic operating system
standard [3]), only the unblocked thread of highest priority may run. This ensures
some lock-less mutual exclusion properties that are actually exploited in real-time embedded
programs and relied on for their correctness (this includes the industrial application
our prototype currently targets). We show how our analysis can take these properties into
account, but we also present an analysis that assumes fewer properties of the scheduler and is
thus sound for true multi-processors and non-real-time schedulers. We handle synchronization
properties (enforced by either locks or priorities) through a partitioning with respect
to an abstraction of the global scheduling state. The partitioning recovers some kind of 
inter-thread flow-sensitivity that would otherwise be completely abstracted away by the 
interference abstraction. 

The approach presented in this article has been implemented and used at the core of 
a prototype analyzer named Thesee. It leverages the static analysis techniques developed 
in Astree [8] for single-threaded programs, and adds the support for multiple threads. We
used Thesee to analyze in 27 h a large (1.7 M lines) multi-threaded industrial embedded C 
avionic application, which illustrates the scalability of our approach. 

Organisation. Our article is organized as follows. First, Sec. 2 presents a classic
non-parallel semantics and its static analysis. Then, Sec. 3 extends them to several threads in
a shared memory and discusses weakly consistent memory issues. A model of the scheduler
and support for locks and priorities are introduced in Sec. 4. Our prototype analyzer,
Thesee, is presented in Sec. 5, as well as some experimental results. Finally, Sec. 6 discusses
related work, and Sec. 7 concludes and envisions future work.

This article defines many semantics. They are summarized in Fig. 1, using $\subseteq$ to denote
the "is less abstract than" relation. We alternate between two kinds of concrete semantics:
semantics based on control paths ($\mathbb{P}_\pi$, $\mathbb{P}_*$, $\mathbb{P}_H$), that can model precisely thread interleavings
and are also useful to characterize weakly consistent memory models ($\mathbb{P}'_*$, $\mathbb{P}'_H$), and semantics
by structural induction on the syntax ($\mathbb{P}$, $\mathbb{P}_I$, $\mathbb{P}_C$), that give rise to effective abstract
interpreters ($\mathbb{P}^\sharp$, $\mathbb{P}^\sharp_I$, $\mathbb{P}^\sharp_C$). Each semantics is presented in its subsection and adds some features
to the previous ones, so that the final abstract analysis $\mathbb{P}^\sharp_C$ presented in Sec. 4 should
hopefully not appear as too complex or artificial, but rather as the logical conclusion of a
step-by-step construction.

Our analysis has been mentioned first, briefly and informally, in [7, § VI]. We offer here
a formal, rigorous treatment by presenting all the semantics fully formally, albeit on an
idealised language, and by studying their relationship. The present article is an extended
version of [47] and includes a more comprehensive description of the semantics as well as
the proofs of all theorems, which were omitted in the conference proceedings due to lack of
space.

Notations. In this article, we use the theory of complete lattices, denoting their partial
order, join, and least element respectively as $\sqsubseteq$, $\sqcup$, and $\bot$, possibly with some subscript
to indicate which lattice is considered. All the lattices we use are actually constructed
by taking the Cartesian product of one or several powerset lattices, i.e., $\mathcal{P}(S)$ for some
set $S$; $\sqsubseteq$, $\sqcup$, and $\bot$ are then respectively the set inclusion $\subseteq$, the set union $\cup$, and the
empty set $\emptyset$, applied independently to each component. Given a monotonic operator $F$
in a complete lattice, we denote by $\mathrm{lfp}\, F$ its least fixpoint, i.e., $F(\mathrm{lfp}\, F) = \mathrm{lfp}\, F$ and
$\forall X : F(X) = X \implies \mathrm{lfp}\, F \sqsubseteq X$, which exists according to Tarski's fixpoint theorem.

Figure 1: Semantics defined in the article.

We denote by $A \to B$ the set of functions from a set $A$ to a set $B$, and by $A \xrightarrow{\sqcup} B$ the set of
complete $\sqcup$-morphisms from a complete lattice $A$ to a complete lattice $B$, i.e., such that
$F(\bigsqcup_A X) = \bigsqcup_B \{ F(x) \mid x \in X \}$ for any finite or infinite set $X \subseteq A$. Additionally, such a
function is monotonic. We use the theory of Abstract Interpretation by Cousot and Cousot
and, more precisely, its concretization-based ($\gamma$) formalization [18]. We use widenings ($\triangledown$) to
ensure termination [17]. The abstract version of a domain, operator, or function is denoted
with a $\sharp$ superscript. We use the lambda notation $\lambda x : f(x)$ to denote functions. If $f$ is a
function, then $f[x \mapsto v]$ is the function with the same domain as $f$ that maps $x$ to $v$, and
all other elements $y \neq x$ to $f(y)$. Likewise, $f[\forall x \in X : x \mapsto g(x)]$ denotes the function
that maps any $x \in X$ to $g(x)$, and other elements $y \notin X$ to $f(y)$. Boldface fonts are
used for syntactic elements, such as "while" in Fig. 2. Pairs and tuples are bracketed by
parentheses, as in $X = (A, B, C)$, and can be deconstructed (matched) with the notation
"let $(A, -, C) = X$ in $\cdots$" where the "$-$" symbol denotes irrelevant tuple elements. The
notation "let $\forall x \in X : y_x = \cdots$ in $\cdots$" is used to bind a collection of variables $(y_x)_{x \in X}$
at once. Semantic functions are denoted with double brackets, as in $\mathbb{X}[\![ y ]\!]$, where $y$ is
an (optional) syntactic object, and $\mathbb{X}$ denotes the kind of objects ($\mathbb{S}$ for statements, $\mathbb{E}$
for expressions, $\mathbb{P}$ for programs, $\Pi$ for control paths). The kind of semantics considered
(parallel, non-parallel, abstract, etc.) is denoted by subscripts and superscripts over $\mathbb{X}$,
as exemplified in Fig. 1. Finally, we use finite words over arbitrary sets, using $\varepsilon$ and $\cdot$
to denote, respectively, the empty word and word concatenation. The concatenation $\cdot$ is
naturally extended to sets of words: $A \cdot B = \{ a \cdot b \mid a \in A,\ b \in B \}$.



2. Non-parallel Programs 

This section recalls a classic static analysis by Abstract Interpretation of the run-time 
errors of non-parallel programs, as performed for instance by Astree [8]. The formalization 
introduced here will be extended later to parallel programs, and it will be apparent that an 
analyzer for parallel programs can be constructed by extending an analyzer for non-parallel 
programs with few changes. 
Figure 2: Syntax of programs.

2.1. Syntax. For the sake of exposition, we reason on a vastly simplified programming language.
However, the results extend naturally to a realistic language, such as the subset of C
excluding recursion and dynamic memory allocation considered in our practical experiments
(Sec. 5). We assume a fixed, finite set of variable names $\mathcal{V}$. A program is a single structured
statement, denoted $\mathrm{body} \in \mathit{stat}$. The syntax of statements $\mathit{stat}$ and of expressions
$\mathit{expr}$ is depicted in Fig. 2. Constants are actually constant intervals $[c_1, c_2]$, which return a
new arbitrary value between $c_1$ and $c_2$ every time the expression is evaluated. This allows
modeling non-deterministic expressions, such as inputs from the environment, or stubs for
expressions that need not be handled precisely, e.g., $\sin(x)$ could be replaced with $[-1, 1]$.
Each unary and binary operator $\diamond_\ell$ is tagged with a syntactic location $\ell \in \mathcal{L}$ and we denote
by $\mathcal{L}$ the finite set of all syntactic locations. The output of an analyzer will be the set of
locations $\ell$ with errors, or rather a superset of them, due to approximations.
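As an illustration only, the grammar described above could be encoded with Python data classes as follows. The class names (`Var`, `Const`, `Neg`, `BinOp`, `Assign`, `If`, `While`, `Seq`) and the helper `locations` are assumptions made for this sketch; they do not come from the article or from the prototype.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Var:            # variable X in V
    name: str

@dataclass(frozen=True)
class Const:          # constant interval [lo, hi]
    lo: float
    hi: float

@dataclass(frozen=True)
class Neg:            # unary minus tagged with a location
    loc: str
    arg: "Expr"

@dataclass(frozen=True)
class BinOp:          # binary operator in {+, -, *, /} tagged with a location
    op: str
    loc: str
    left: "Expr"
    right: "Expr"

Expr = Union[Var, Const, Neg, BinOp]

@dataclass(frozen=True)
class Assign:         # X <- e
    var: str
    expr: Expr

@dataclass(frozen=True)
class If:             # if e cmp 0 then s
    cond: Expr
    cmp: str
    body: "Stat"

@dataclass(frozen=True)
class While:          # while e cmp 0 do s
    cond: Expr
    cmp: str
    body: "Stat"

@dataclass(frozen=True)
class Seq:            # s1; s2
    first: "Stat"
    second: "Stat"

Stat = Union[Assign, If, While, Seq]

def locations(node):
    """Collect the syntactic locations of all operators under `node`."""
    if isinstance(node, (Var, Const)):
        return set()
    if isinstance(node, Neg):
        return {node.loc} | locations(node.arg)
    if isinstance(node, BinOp):
        return {node.loc} | locations(node.left) | locations(node.right)
    if isinstance(node, Assign):
        return locations(node.expr)
    if isinstance(node, (If, While)):
        return locations(node.cond) | locations(node.body)
    if isinstance(node, Seq):
        return locations(node.first) | locations(node.second)
    raise TypeError(node)

# X <- [1,1]; while X < 0 do Y <- [1,1] /_l1 (-_l2 X)
prog = Seq(Assign("X", Const(1, 1)),
           While(Var("X"), "<",
                 Assign("Y", BinOp("/", "l1", Const(1, 1), Neg("l2", Var("X"))))))
all_locs = locations(prog)
```

The set `all_locs` plays the role of the finite set of syntactic locations from which an analyzer would draw its alarms.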

For the sake of simplicity, we do not handle procedures. These are handled by inlining 
in our prototype. We also focus on a single data-type (real numbers in $\mathbb{R}$) and numeric
expressions, which are sufficient to provide interesting properties to express, e.g., variable
bounds, although in the following we will only discuss proving the absence of division by
zero. Handling of realistic data-types (machine integers, floats, arrays, structures, pointers,
etc.) and more complex properties (such as the absence of numeric and pointer overflow) as
done in our prototype is orthogonal, and existing methods apply directly, for instance [7].

2.2. Concrete Structured Semantics P. As usual in Abstract Interpretation, we start 
by providing a concrete semantics, that is, the most precise mathematical expression of 
program semantics we consider. It should be able to express the properties of interest to us, 
i.e., which run-time errors can occur (only divisions by zero for the simplified language
of Fig. 2). For this, it is sufficient that our concrete semantics tracks numerical invariants.
As this problem is undecidable, it will be abstracted in the next section to obtain a sound 
static analysis. 

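A program environment $\rho \in \mathcal{E}$ maps each variable to a value, i.e., $\mathcal{E} \stackrel{\text{def}}{=} \mathcal{V} \to \mathbb{R}$. The
semantics $\mathbb{E}[\![ e ]\!]$ of an expression $e \in \mathit{expr}$ takes as input a single environment $\rho$, and outputs
a set of values, in $\mathcal{P}(\mathbb{R})$, and a set of locations of run-time errors, in $\mathcal{P}(\mathcal{L})$. It is defined by
structural induction in Fig. 3. Note that an expression can evaluate to one value, several
values (due to non-determinism in $[c_1, c_2]$) or no value at all (in the case of a division by
zero).

\[
\begin{array}{l}
\mathbb{E}[\![ e ]\!] : \mathcal{E} \to (\mathcal{P}(\mathbb{R}) \times \mathcal{P}(\mathcal{L})) \\[4pt]
\mathbb{E}[\![ X ]\!]\rho \stackrel{\text{def}}{=} (\{ \rho(X) \}, \emptyset) \\
\mathbb{E}[\![ [c_1, c_2] ]\!]\rho \stackrel{\text{def}}{=} (\{ c \in \mathbb{R} \mid c_1 \leq c \leq c_2 \}, \emptyset) \\
\mathbb{E}[\![ -_\ell\, e ]\!]\rho \stackrel{\text{def}}{=} \text{let } (V, \Omega) = \mathbb{E}[\![ e ]\!]\rho \text{ in } (\{ -x \mid x \in V \}, \Omega) \\
\mathbb{E}[\![ e_1 \diamond_\ell e_2 ]\!]\rho \stackrel{\text{def}}{=} \\
\quad \text{let } (V_1, \Omega_1) = \mathbb{E}[\![ e_1 ]\!]\rho \text{ in} \\
\quad \text{let } (V_2, \Omega_2) = \mathbb{E}[\![ e_2 ]\!]\rho \text{ in} \\
\quad (\{ x_1 \diamond x_2 \mid x_1 \in V_1,\ x_2 \in V_2,\ \diamond \neq / \lor x_2 \neq 0 \}, \\
\quad\ \ \Omega_1 \cup \Omega_2 \cup \{ \ell \mid \diamond = / \land 0 \in V_2 \}) \\
\quad \text{where } \diamond \in \{ +, -, \times, / \}
\end{array}
\]

Figure 3: Concrete semantics of expressions.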
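The definition of Fig. 3 can be sketched as a small recursive evaluator. This is only an illustration under stated assumptions: expressions are encoded as nested tuples (a hypothetical encoding chosen for brevity), and constant intervals are sampled at integer points so that value sets stay finite, whereas the article works with all reals in the interval.

```python
import operator

# Binary operators of the simplified language.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_expr(e, rho):
    """Return (set of values, set of error locations) for e in environment rho."""
    kind = e[0]
    if kind == "var":                       # ("var", name)
        return {rho[e[1]]}, set()
    if kind == "const":                     # ("const", c1, c2), integer bounds (assumption)
        _, c1, c2 = e
        return set(range(c1, c2 + 1)), set()
    if kind == "neg":                       # ("neg", loc, arg)
        _, _loc, arg = e
        vals, errs = eval_expr(arg, rho)
        return {-x for x in vals}, errs
    if kind == "bin":                       # ("bin", op, loc, e1, e2)
        _, op, loc, e1, e2 = e
        v1, o1 = eval_expr(e1, rho)
        v2, o2 = eval_expr(e2, rho)
        # Divisions by zero produce no value but record the operator location.
        vals = {OPS[op](x1, x2) for x1 in v1 for x2 in v2
                if op != "/" or x2 != 0}
        errs = o1 | o2 | ({loc} if op == "/" and 0 in v2 else set())
        return vals, errs
    raise ValueError(kind)

# 1 / [-1, 1]: two possible results and one possible division by zero at l1.
values, errors = eval_expr(("bin", "/", "l1", ("const", 1, 1), ("const", -1, 1)), {})
# 1 / [0, 0]: no value at all, only the error.
no_values, only_error = eval_expr(("bin", "/", "l1", ("const", 1, 1), ("const", 0, 0)), {})
```

The second call illustrates the remark above that an expression may evaluate to no value at all while still reporting an error location.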

To define the semantics of statements, we consider as semantic domain the complete
lattice:

\[ \mathcal{D} \stackrel{\text{def}}{=} \mathcal{P}(\mathcal{E}) \times \mathcal{P}(\mathcal{L}) \tag{2.1} \]

with partial order $\sqsubseteq$ defined as the pairwise set inclusion: $(A, B) \sqsubseteq (A', B') \iff A \subseteq A' \land B \subseteq B'$.
We denote by $\sqcup$ the associated join, i.e., pairwise set union. The structured
semantics $\mathbb{S}[\![ s ]\!]$ of a statement $s$ is a morphism in $\mathcal{D}$ that, given a set of environments $R$
and errors $\Omega$ before a statement $s$, returns the reachable environments after $s$, as well as $\Omega$
enriched with the errors encountered during the execution of $s$. It is defined by structural
induction in Fig. 4. We introduce the new statements $e \bowtie 0?$ (where $\bowtie \in \{ =, \neq, <, >, \leq, \geq \}$
is a comparison operator) which we call "guards." These statements do not appear stand-alone
in programs, but are useful to factor the semantic definition of conditionals and loops
(they are similar to the guards used in Dijkstra's Guarded Commands [25]). Guards will also
prove useful to define control paths in Sec. 2.4. Guards filter their argument and keep only
those environments where the expression $e$ evaluates to a set containing a value $v$ satisfying
$v \bowtie 0$. The symbol $\not\bowtie$ denotes the negation of $\bowtie$, i.e., the negation of $=$, $\neq$, $<$, $>$, $\leq$, $\geq$ is,
respectively, $\neq$, $=$, $\geq$, $\leq$, $>$, $<$. Finally, the semantics of loops computes a loop invariant
using the least fixpoint operator $\mathrm{lfp}$. The fact that such fixpoints exist, and the related fact
that the semantic functions are complete $\sqcup$-morphisms, i.e., $\mathbb{S}[\![ s ]\!](\bigsqcup_{i \in I} X_i) = \bigsqcup_{i \in I} \mathbb{S}[\![ s ]\!] X_i$,
is stated in the following theorem:

Theorem 2.1. $\forall s \in \mathit{stat}$ : $\mathbb{S}[\![ s ]\!]$ is well defined and a complete $\sqcup$-morphism.

Proof. In Appendix A.1. □



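We can now define the concrete structured semantics of the program as follows:

\[ \mathbb{P} \stackrel{\text{def}}{=} \Omega, \text{ where } (-, \Omega) = \mathbb{S}[\![ \mathrm{body} ]\!](\mathcal{E}_0, \emptyset) \tag{2.2} \]

where $\mathcal{E}_0 \subseteq \mathcal{E}$ is a set of initial environments. We can choose, for instance, $\mathcal{E}_0 = \mathcal{E}$ or
$\mathcal{E}_0 = \{ \lambda X \in \mathcal{V} : 0 \}$. Note that all run-time errors are collected while traversing the
program structure; they are never discarded and all of them eventually reach the end of
body, and so, appear in $\mathbb{P}$, even if $\mathbb{S}[\![ \mathrm{body} ]\!](\mathcal{E}_0, \emptyset)$ outputs an empty set of environments. Our
program semantics thus observes the set of run-time errors that can appear in any execution
starting at the beginning of body in an initial environment. This includes errors occurring in
executions that loop forever (such as infinite reactive loops in control / command software)
or that halt before the end of body.

\[
\begin{array}{l}
\mathbb{S}[\![ s ]\!] : \mathcal{D} \xrightarrow{\sqcup} \mathcal{D} \\[4pt]
\mathbb{S}[\![ X \leftarrow e ]\!](R, \Omega) \stackrel{\text{def}}{=} (\emptyset, \Omega) \sqcup \bigsqcup_{\rho \in R} \text{let } (V, \Omega') = \mathbb{E}[\![ e ]\!]\rho \text{ in } (\{ \rho[X \mapsto v] \mid v \in V \}, \Omega') \\
\mathbb{S}[\![ e \bowtie 0? ]\!](R, \Omega) \stackrel{\text{def}}{=} (\emptyset, \Omega) \sqcup \bigsqcup_{\rho \in R} \text{let } (V, \Omega') = \mathbb{E}[\![ e ]\!]\rho \text{ in } (\{ \rho \mid \exists v \in V : v \bowtie 0 \}, \Omega') \\
\mathbb{S}[\![ s_1; s_2 ]\!](R, \Omega) \stackrel{\text{def}}{=} (\mathbb{S}[\![ s_2 ]\!] \circ \mathbb{S}[\![ s_1 ]\!])(R, \Omega) \\
\mathbb{S}[\![ \text{if } e \bowtie 0 \text{ then } s ]\!](R, \Omega) \stackrel{\text{def}}{=} (\mathbb{S}[\![ s ]\!] \circ \mathbb{S}[\![ e \bowtie 0? ]\!])(R, \Omega) \sqcup \mathbb{S}[\![ e \not\bowtie 0? ]\!](R, \Omega) \\
\mathbb{S}[\![ \text{while } e \bowtie 0 \text{ do } s ]\!](R, \Omega) \stackrel{\text{def}}{=} \mathbb{S}[\![ e \not\bowtie 0? ]\!](\mathrm{lfp}\, \lambda X : (R, \Omega) \sqcup (\mathbb{S}[\![ s ]\!] \circ \mathbb{S}[\![ e \bowtie 0? ]\!]) X) \\
\quad \text{where } \bowtie \in \{ =, \neq, <, >, \leq, \geq \}
\end{array}
\]

Figure 4: Structured concrete semantics of statements.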
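The structured semantics can be sketched as higher-order functions over pairs (environments, errors). This is a minimal sketch under assumptions of our own: expressions are passed as Python functions returning (values, errors), environments are frozen sets of (variable, value) pairs, and the least fixpoint is computed by Kleene iteration, which terminates only when the set of reachable environments is finite. The names `assign`, `guard`, `seq`, and `while_` are hypothetical.

```python
def assign(x, e):
    """S[[x <- e]]; e maps an environment dict to (values, errors)."""
    def sem(state):
        envs, errs = state
        out, errs = set(), set(errs)
        for fenv in envs:
            rho = dict(fenv)
            vals, new_errs = e(rho)
            errs |= new_errs
            out |= {frozenset({**rho, x: v}.items()) for v in vals}
        return frozenset(out), frozenset(errs)
    return sem

def guard(e, test):
    """S[[e cmp 0?]]; keeps environments where some value of e passes `test`."""
    def sem(state):
        envs, errs = state
        out, errs = set(), set(errs)
        for fenv in envs:
            vals, new_errs = e(dict(fenv))
            errs |= new_errs
            if any(test(v) for v in vals):
                out.add(fenv)
        return frozenset(out), frozenset(errs)
    return sem

def seq(*stmts):
    """S[[s1; ...; sn]]: composition in program order."""
    def sem(state):
        for s in stmts:
            state = s(state)
        return state
    return sem

def join(a, b):
    """Pairwise set union on (environments, errors)."""
    return a[0] | b[0], a[1] | b[1]

def while_(e, test, body):
    """S[[while e cmp 0 do s]] with lfp computed by Kleene iteration."""
    enter = guard(e, test)
    leave = guard(e, lambda v: not test(v))
    def sem(state):
        x = (frozenset(), frozenset())          # bottom of the lattice
        while True:
            nxt = join(state, body(enter(x)))
            if nxt == x:
                return leave(x)
            x = nxt
    return sem

# while X >= 0 do { Y <- 1 /_l1 X; X <- X - 1 }, starting from X = 2, Y = 0.
div = assign("Y", lambda rho: ({1 / rho["X"]}, set()) if rho["X"] != 0
             else (set(), {"l1"}))
dec = assign("X", lambda rho: ({rho["X"] - 1}, set()))
loop = while_(lambda rho: ({rho["X"]}, set()), lambda v: v >= 0, seq(div, dec))
init = (frozenset({frozenset({"X": 2, "Y": 0}.items())}), frozenset())
final_envs, errors = loop(init)
```

In this example every execution ends in a division by zero, so no environment reaches the loop exit, yet the error location is still reported, as noted in the text above.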

2.3. Abstract Structured Semantics $\mathbb{P}^\sharp$. The semantics $\mathbb{P}$ is not computable as it involves
least fixpoints in an infinite-height domain $\mathcal{D}$, and not all elements in $\mathcal{D}$ are representable
in a computer as $\mathcal{D}$ is uncountable. Even if we restricted variable values to a more
realistic, large but finite, subset (such as machine integers or floats), naive computation
in $\mathcal{D}$ would be impractical. An effective analysis will instead compute an abstract semantics
over-approximating the concrete one.

The abstract semantics is parametrized by the choice of an abstract domain of environments
obeying the signature presented in Fig. 5. It comprises a set $\mathcal{E}^\sharp$ of computer-representable
abstract environments, with a partial order $\sqsubseteq^\sharp_{\mathcal{E}}$ (denoting abstract entailment)
and an abstract environment $\mathcal{E}^\sharp_0 \in \mathcal{E}^\sharp$ representing initial environments. Each abstract
environment represents a set of concrete environments through a monotonic concretization
function $\gamma_{\mathcal{E}} : \mathcal{E}^\sharp \to \mathcal{P}(\mathcal{E})$. We also require an effective abstract version $\cup^\sharp_{\mathcal{E}}$ of the set union
$\cup$, as well as effective abstract versions $\mathbb{S}^\sharp[\![ s ]\!]$ of the semantic operators $\mathbb{S}[\![ s ]\!]$ for assignment
and guard statements. Only environment sets are abstracted, while error sets are represented
explicitly, so that the actual abstract semantic domain for $\mathbb{S}^\sharp[\![ s ]\!]$ is $\mathcal{D}^\sharp \stackrel{\text{def}}{=} \mathcal{E}^\sharp \times \mathcal{P}(\mathcal{L})$,
with concretization $\gamma$ defined in Fig. 5. Figure 5 also presents the soundness conditions that
state that an abstract operator outputs a superset of the environments and error locations
returned by its concrete version. Finally, when $\mathcal{E}^\sharp$ has infinite strictly increasing chains, we
require a widening operator $\triangledown_{\mathcal{E}}$, i.e., a sound abstraction of the join $\cup$ with a termination
guarantee to ensure the convergence of abstract fixpoint computations in finite time. There
exist many abstract domains $\mathcal{E}^\sharp$, for instance the interval domain [13], where an abstract
environment in $\mathcal{E}^\sharp$ associates an interval to each variable, the octagon domain [36], where
an abstract environment in $\mathcal{E}^\sharp$ is a conjunction of constraints of the form $\pm X \pm Y \leq c$ with
$X, Y \in \mathcal{V}$, $c \in \mathbb{R}$, or the polyhedra domain [21], where an abstract environment in $\mathcal{E}^\sharp$ is a
convex, closed (possibly unbounded) polyhedron.

In the following, we will refer to assignments and guards collectively as primitive statements.
Their abstract semantics $\mathbb{S}^\sharp[\![ s ]\!]$ in $\mathcal{D}^\sharp$ depends on the choice of abstract domain;
we assume it is provided as part of the abstract domain definition and do not discuss it.
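\[
\begin{array}{ll}
\mathcal{E}^\sharp & \text{(set of abstract environments)} \\
\gamma_{\mathcal{E}} : \mathcal{E}^\sharp \to \mathcal{P}(\mathcal{E}) & \text{(concretization)} \\
\bot^\sharp_{\mathcal{E}} \in \mathcal{E}^\sharp & \text{(empty abstract environment)} \\
\mathcal{E}^\sharp_0 \in \mathcal{E}^\sharp & \text{(initial abstract environment)} \\
\sqsubseteq^\sharp_{\mathcal{E}} & \text{(abstract entailment)} \\
\cup^\sharp_{\mathcal{E}} : (\mathcal{E}^\sharp \times \mathcal{E}^\sharp) \to \mathcal{E}^\sharp & \text{(abstract join)} \\
\quad \text{s.t. } \gamma_{\mathcal{E}}(X^\sharp \cup^\sharp_{\mathcal{E}} Y^\sharp) \supseteq \gamma_{\mathcal{E}}(X^\sharp) \cup \gamma_{\mathcal{E}}(Y^\sharp) \\
\triangledown_{\mathcal{E}} : (\mathcal{E}^\sharp \times \mathcal{E}^\sharp) \to \mathcal{E}^\sharp & \text{(widening)} \\
\quad \text{s.t. } \gamma_{\mathcal{E}}(X^\sharp \triangledown_{\mathcal{E}} Y^\sharp) \supseteq \gamma_{\mathcal{E}}(X^\sharp) \cup \gamma_{\mathcal{E}}(Y^\sharp) \\
\quad \text{and } \forall (Y^\sharp_i)_{i \in \mathbb{N}} : \text{the sequence } X^\sharp_0 = Y^\sharp_0,\ X^\sharp_{i+1} = X^\sharp_i \triangledown_{\mathcal{E}} Y^\sharp_{i+1} \\
\quad \text{reaches a fixpoint } X^\sharp_k = X^\sharp_{k+1} \text{ for some } k \in \mathbb{N} \\
\mathcal{D}^\sharp \stackrel{\text{def}}{=} \mathcal{E}^\sharp \times \mathcal{P}(\mathcal{L}) & \text{(abstraction of } \mathcal{D}\text{)} \\
\gamma : \mathcal{D}^\sharp \to \mathcal{D} & \text{(concretization for } \mathcal{D}^\sharp\text{)} \\
\quad \gamma(R^\sharp, \Omega) \stackrel{\text{def}}{=} (\gamma_{\mathcal{E}}(R^\sharp), \Omega) \\
\mathbb{S}^\sharp[\![ s ]\!] : \mathcal{D}^\sharp \to \mathcal{D}^\sharp \\
\quad \text{s.t. } \forall s \in \{ X \leftarrow e,\ e \bowtie 0? \} : (\mathbb{S}[\![ s ]\!] \circ \gamma)(R^\sharp, \Omega) \sqsubseteq (\gamma \circ \mathbb{S}^\sharp[\![ s ]\!])(R^\sharp, \Omega)
\end{array}
\]

Figure 5: Abstract domain signature, and soundness and termination conditions.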

By contrast, the semantics of non-primitive statements can be derived in a generic way,
as presented in Fig. 6. Note the similarity between these definitions and the concrete
semantics of Fig. 4, except for the semantics of loops that uses additionally a widening
operator $\triangledown$ derived from $\triangledown_{\mathcal{E}}$. The termination guarantee of the widening ensures that,
given any (not necessarily monotonic) function $F^\sharp : \mathcal{D}^\sharp \to \mathcal{D}^\sharp$, the sequence $X^\sharp_0 = (\bot^\sharp_{\mathcal{E}}, \emptyset)$,
$X^\sharp_{i+1} = X^\sharp_i \triangledown F^\sharp(X^\sharp_i)$ reaches a fixpoint $X^\sharp_k = X^\sharp_{k+1}$ in finite time $k \in \mathbb{N}$. We denote this
limit by $\lim \lambda X^\sharp : X^\sharp \triangledown F^\sharp(X^\sharp)$. Note that, due to widening, the semantics of a loop is generally
not a join morphism, and not even monotonic [17], even if the semantics of the loop
body is. Hence, there would be little benefit in imposing that the semantics of primitive
statements provided with $\mathcal{D}^\sharp$ be monotonic, and we do not impose it in Fig. 5. Note also
that $\lim F^\sharp$ may not be the least fixpoint of $F^\sharp$ (in fact, such a least fixpoint may not even
exist).



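The abstract semantics of a program can then be defined, similarly to (2.2), as:

\[ \mathbb{P}^\sharp \stackrel{\text{def}}{=} \Omega, \text{ where } (-, \Omega) = \mathbb{S}^\sharp[\![ \mathrm{body} ]\!](\mathcal{E}^\sharp_0, \emptyset) . \]

The following theorem states the soundness of the abstract semantics:

Theorem 2.2. $\mathbb{P} \subseteq \mathbb{P}^\sharp$.

Proof. In Appendix A.2. □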
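\[
\begin{array}{l}
\mathbb{S}^\sharp[\![ s_1; s_2 ]\!](R^\sharp, \Omega) \stackrel{\text{def}}{=} (\mathbb{S}^\sharp[\![ s_2 ]\!] \circ \mathbb{S}^\sharp[\![ s_1 ]\!])(R^\sharp, \Omega) \\
\mathbb{S}^\sharp[\![ \text{if } e \bowtie 0 \text{ then } s ]\!](R^\sharp, \Omega) \stackrel{\text{def}}{=} \\
\quad (\mathbb{S}^\sharp[\![ s ]\!] \circ \mathbb{S}^\sharp[\![ e \bowtie 0? ]\!])(R^\sharp, \Omega) \cup^\sharp \mathbb{S}^\sharp[\![ e \not\bowtie 0? ]\!](R^\sharp, \Omega) \\
\mathbb{S}^\sharp[\![ \text{while } e \bowtie 0 \text{ do } s ]\!](R^\sharp, \Omega) \stackrel{\text{def}}{=} \\
\quad \mathbb{S}^\sharp[\![ e \not\bowtie 0? ]\!](\lim \lambda X^\sharp : X^\sharp \triangledown ((R^\sharp, \Omega) \cup^\sharp (\mathbb{S}^\sharp[\![ s ]\!] \circ \mathbb{S}^\sharp[\![ e \bowtie 0? ]\!]) X^\sharp)) \\
\text{where:} \\
\quad (R^\sharp_1, \Omega_1) \cup^\sharp (R^\sharp_2, \Omega_2) \stackrel{\text{def}}{=} (R^\sharp_1 \cup^\sharp_{\mathcal{E}} R^\sharp_2,\ \Omega_1 \cup \Omega_2) \\
\quad (R^\sharp_1, \Omega_1) \triangledown (R^\sharp_2, \Omega_2) \stackrel{\text{def}}{=} (R^\sharp_1 \triangledown_{\mathcal{E}} R^\sharp_2,\ \Omega_1 \cup \Omega_2)
\end{array}
\]

Figure 6: Derived abstract functions for non-primitive statements.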

The resulting analysis is flow-sensitive. It is relational whenever $\mathcal{E}^\sharp$ is, e.g., with
octagons [36]. The iterator follows, in the terminology of [10], a recursive iteration strategy.
The advantage of this strategy is its efficient use of memory: few abstract elements need to
be kept in memory during the analysis. Indeed, apart from the current abstract environment,
a clever implementation of Fig. 6 exploiting tail recursion would only need to keep
one extra environment per if $e \bowtie 0$ then $s$ statement (to remember the $(R^\sharp, \Omega)$ argument
while evaluating $s$) and two environments per while $e \bowtie 0$ do $s$ statement (one for
$(R^\sharp, \Omega)$ and one for the accumulator $X^\sharp$) in the call stack of the abstract interpreter function
$\mathbb{S}^\sharp$. Thus, the maximum memory consumption is a function of the maximum nesting
of conditionals and loops in the analyzed program, which is generally low. This efficiency
is key to analyze large programs, as demonstrated by Astree [8].
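To illustrate the loop rule with widening, here is a minimal sketch using the interval domain on a single real-valued variable. It is not the implementation of the prototype: the function names are hypothetical, error sets are omitted, and the transfer functions for the guard and the assignment are written by hand for one example, namely X ← 0 followed by while X − 100 < 0 do X ← X + 1.

```python
import math

BOT = None  # empty abstract environment: represents no concrete value

def join(a, b):
    """Abstract join of two intervals (lo, hi)."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Standard interval widening: any unstable bound jumps to infinity."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    lo = a[0] if a[0] <= b[0] else -math.inf
    hi = a[1] if a[1] >= b[1] else math.inf
    return (lo, hi)

def guard_lt(a, c):
    """Sound abstraction of the guard X < c on reals (closed upper bound c)."""
    if a is BOT or a[0] >= c:
        return BOT
    return (a[0], min(a[1], c))

def guard_ge(a, c):
    """Abstraction of the guard X >= c."""
    if a is BOT or a[1] < c:
        return BOT
    return (max(a[0], c), a[1])

def add_const(a, k):
    """Abstraction of the assignment X <- X + k."""
    return BOT if a is BOT else (a[0] + k, a[1] + k)

def analyze_loop(entry, enter, body, leave):
    """Iterate X := X widen (entry join body(enter(X))) from BOT until stable,
    then apply the exit guard, following the shape of the while rule."""
    x = BOT
    while True:
        nxt = widen(x, join(entry, body(enter(x))))
        if nxt == x:
            return x, leave(x)
        x = nxt

invariant, at_exit = analyze_loop(
    entry=(0, 0),
    enter=lambda a: guard_lt(a, 100),
    body=lambda a: add_const(a, 1),
    leave=lambda a: guard_ge(a, 100),
)
```

The computed invariant is [0, +∞), which is sound but coarser than the least fixpoint [0, 100]; this is an instance of the remark that the widening limit need not be the least fixpoint.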



2.4. Concrete Path-Based Semantics $\mathbb{P}_\pi$. The structured semantics of Sec. 2.2 is defined
as an interpretation of the program by induction on its syntactic structure, which can
be conveniently transformed into a static analyzer, as shown in Sec. 2.3. Unfortunately,
the execution of a parallel program does not follow such a simple syntactic structure; it is
rather defined as an interleaving of control paths from distinct threads (Sec. 3.1). Before
considering parallel programs, we start by proposing in this section an alternate concrete
semantics of non-parallel programs based on control paths. While its definition is different
from the structured semantics of Sec. 2.2, its output is equivalent.

A control path $p$ is any finite sequence of primitive statements, among $X \leftarrow e$, $e \bowtie 0?$.
We denote by $\Pi$ the set of all control paths. Given a statement $s$, the set of control paths
it spawns $\pi(s) \subseteq \Pi$ is defined by structural induction as follows:

\[
\begin{array}{lcl}
\pi(X \leftarrow e) & \stackrel{\text{def}}{=} & \{ X \leftarrow e \} \\
\pi(s_1; s_2) & \stackrel{\text{def}}{=} & \pi(s_1) \cdot \pi(s_2) \\
\pi(\text{if } e \bowtie 0 \text{ then } s) & \stackrel{\text{def}}{=} & (\{ e \bowtie 0? \} \cdot \pi(s)) \cup \{ e \not\bowtie 0? \} \\
\pi(\text{while } e \bowtie 0 \text{ do } s) & \stackrel{\text{def}}{=} & (\mathrm{lfp}\, \lambda X : \{ \varepsilon \} \cup (X \cdot \{ e \bowtie 0? \} \cdot \pi(s))) \cdot \{ e \not\bowtie 0? \}
\end{array}
\tag{2.3}
\]

where $\varepsilon$ denotes the empty path, and $\cdot$ denotes path concatenation, naturally extended
to sets of paths. A primitive statement spawns a singleton path of length one, while a
conditional spawns two sets of paths (one set where the then branch is taken, and one
where it is not taken) and loops spawn an infinite number of paths, corresponding to all



10 ANTOINE MINE 



possible unrollings. Although $\pi(s)$ is infinite whenever $s$ contains a loop, it is possible that
many control paths in $\pi(s)$ are actually infeasible, i.e., have no corresponding execution. In
particular, even if a loop $s$ is always bounded, $\pi(s)$ contains unrollings of arbitrary length.
We can now define the semantics $\Pi[\![ P ]\!] \in \mathcal{D} \xrightarrow{\sqcup} \mathcal{D}$ of a set of paths $P \subseteq \Pi$ as follows,
reusing the semantics of primitive statements from Fig. 4 and the pairwise join $\sqcup$ on sets of
environments and errors:

\[ \Pi[\![ P ]\!](R, \Omega) \stackrel{\text{def}}{=} \bigsqcup \{ (\mathbb{S}[\![ s_n ]\!] \circ \cdots \circ \mathbb{S}[\![ s_1 ]\!])(R, \Omega) \mid s_1 \cdot \ldots \cdot s_n \in P \} . \tag{2.4} \]

The path-based semantics of a program is then:

\[ \mathbb{P}_\pi \stackrel{\text{def}}{=} \Omega, \text{ where } (-, \Omega) = \Pi[\![ \pi(\mathrm{body}) ]\!](\mathcal{E}_0, \emptyset) . \tag{2.5} \]

Note that this semantics is similar to the standard meet over all paths solution of data-flow
problems (see, e.g., [48, § 2]) but for concrete executions in the infinite-height lattice $\mathcal{D}$.
The meet over all paths and maximum fixpoint solutions of data-flow problems are equal
for distributive frameworks; similarly, our structured and path-based concrete semantics
(based on complete $\sqcup$-morphisms) are equal:

Theorem 2.3. $\forall s \in \mathit{stat} : \Pi[\![ \pi(s) ]\!] = \mathbb{S}[\![ s ]\!]$.

Proof. In Appendix A.3. □

An immediate consequence of this theorem is that $\mathbb{P} = \mathbb{P}_\pi$, hence the two semantics compute,
in different ways, the same set of errors.
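The set of control paths spawned by a statement can be enumerated mechanically, provided loops are unrolled only a bounded number of times. The sketch below makes that truncation explicit through a parameter `k`, which is an assumption of this illustration (the definition in the article uses a least fixpoint and yields infinitely many paths). Statements are encoded as tuples and guards as strings, both hypothetical encodings.

```python
def paths(s, k):
    """Control paths of statement s, unrolling each loop at most k times.

    Statements: ("assign", x, e), ("seq", s1, s2), ("if", cond, s),
    ("while", cond, s). Paths are tuples of primitive statements (strings).
    """
    kind = s[0]
    if kind == "assign":
        return {(f"{s[1]} <- {s[2]}",)}
    if kind == "seq":
        return {p + q for p in paths(s[1], k) for q in paths(s[2], k)}
    if kind == "if":
        _, cond, body = s
        taken = {(f"{cond}?",) + p for p in paths(body, k)}
        return taken | {(f"not({cond})?",)}
    if kind == "while":
        _, cond, body = s
        acc = {()}          # the empty path: zero iterations
        frontier = {()}
        for _ in range(k):  # add paths with one more iteration each round
            frontier = {p + (f"{cond}?",) + q
                        for p in frontier for q in paths(body, k)}
            acc |= frontier
        return {p + (f"not({cond})?",) for p in acc}
    raise ValueError(kind)

loop = ("while", "X - 2 < 0", ("assign", "X", "X + 1"))
unrolled = paths(loop, 2)
```

With `k = 2` the loop yields three paths (zero, one, and two iterations), each ending with the negated guard. Some of these paths may be infeasible for a given initial environment, as the text points out.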

3. Parallel Programs in a Shared Memory 

In this section, we consider several threads that communicate through a shared memory,
without any synchronization primitive yet; they will be introduced in Sec. 4. We also
discuss here the memory consistency model, and its effect on the semantics and the static
analysis.

A program has now a fixed, finite set $\mathcal{T}$ of threads. To each thread $t \in \mathcal{T}$ is associated
a statement body $\mathrm{body}_t \in \mathit{stat}$. All the variables in $\mathcal{V}$ are shared and can be accessed by all
threads.

3.1. Concrete Interleaving Semantics $\mathbb{P}_*$. The simplest and most natural model of
parallel program execution considers all possible interleavings of control paths from all
threads. These correspond to sequentially consistent executions, as coined by Lamport [39].

A parallel control path $p$ is a finite sequence of pairs $(s, t)$, where $s$ is a primitive
statement (assignment or guard) and $t \in \mathcal{T}$ is a thread that executes it. We denote by $\Pi_*$
the set of all parallel control paths. The semantics $\Pi_*[\![ P ]\!] \in \mathcal{D} \xrightarrow{\sqcup} \mathcal{D}$ of a set of parallel
control paths $P \subseteq \Pi_*$ is defined as in the case of regular control paths (2.4), ignoring thread
identifiers:

\[ \Pi_*[\![ P ]\!](R, \Omega) \stackrel{\text{def}}{=} \bigsqcup \{ (\mathbb{S}[\![ s_n ]\!] \circ \cdots \circ \mathbb{S}[\![ s_1 ]\!])(R, \Omega) \mid (s_1, -) \cdot \ldots \cdot (s_n, -) \in P \} . \tag{3.1} \]



The lattices used in data-flow analysis and in abstract interpretation are dual: the former use a meet to 
join paths — hence the expression "meet over all paths" — while we employ a join U. Likewise, the greatest 
fixpoint solution of a data-flow analysis corresponds to our least fixpoint. 
We now denote by $\pi_* \subseteq \Pi_*$ the set of parallel control paths spawned by the whole
program. It is defined as:

\[ \pi_* \stackrel{\text{def}}{=} \{ p \in \Pi_* \mid \forall t \in \mathcal{T} : \mathit{proj}_t(p) \in \pi(\mathrm{body}_t) \} \tag{3.2} \]

where the set $\pi(\mathrm{body}_t)$ of regular control paths of a thread is defined in (2.3), and $\mathit{proj}_t(p)$
extracts the maximal sub-path of $p$ on thread $t$ as follows:

\[
\begin{array}{l}
\mathit{proj}_t((s_1, t_1) \cdot \ldots \cdot (s_n, t_n)) \stackrel{\text{def}}{=} s_{i_1} \cdot \ldots \cdot s_{i_m} \\
\quad \text{such that } \forall j : 1 \leq i_j < i_{j+1} \leq n \ \land\ t_{i_j} = t \ \land \\
\quad \forall k : (k < i_1 \lor k > i_m \lor i_j < k < i_{j+1}) \implies t_k \neq t .
\end{array}
\]

The semantics $\mathbb{P}_*$ of a parallel program becomes, similarly to (2.5):

\[ \mathbb{P}_* \stackrel{\text{def}}{=} \Omega, \text{ where } (-, \Omega) = \Pi_*[\![ \pi_* ]\!](\mathcal{E}_0, \emptyset) \tag{3.3} \]

i.e., we collect the errors that can appear in any interleaved execution of all the threads,
starting in an initial environment.

Because we interleave primitive statements, a thread can only interrupt another one
between two primitive statements, and not in the middle of a primitive statement. For
instance, in a statement such as $X \leftarrow Y + Y$, no thread can interrupt the current one and
change the value of $Y$ between the evaluation of the first and the second $Y$ sub-expression,
while it can if the assignment is split into $X \leftarrow Y; X \leftarrow X + Y$. Primitive statements are
thus atomic in $\mathbb{P}_*$. By contrast, we will present a semantics where primitive statements are
not atomic in Sec. 3.4.
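The effect of atomicity on the example above can be checked by brute force on a tiny instance. In this sketch, which is only an illustration, a second thread executing Y ← 1 is an assumption added for the example, both variables start at 0, and primitive statements are modeled as Python functions that update an environment in one indivisible step.

```python
def interleavings(p, q):
    """All interleavings of two paths (tuples of steps), keeping per-thread order."""
    if not p:
        return {q}
    if not q:
        return {p}
    return ({(p[0],) + r for r in interleavings(p[1:], q)} |
            {(q[0],) + r for r in interleavings(p, q[1:])})

def run(path, env):
    """Execute a path of primitive statements on a copy of env."""
    env = dict(env)
    for step in path:
        step(env)
    return env

# Primitive statements, each executed atomically.
def x_gets_y_plus_y(env): env["X"] = env["Y"] + env["Y"]
def x_gets_y(env):        env["X"] = env["Y"]
def x_gets_x_plus_y(env): env["X"] = env["X"] + env["Y"]
def y_gets_1(env):        env["Y"] = 1

init = {"X": 0, "Y": 0}

# Thread 1 runs the single primitive statement X <- Y + Y.
atomic = {run(p, init)["X"]
          for p in interleavings((x_gets_y_plus_y,), (y_gets_1,))}

# Thread 1 runs the split version X <- Y; X <- X + Y.
split = {run(p, init)["X"]
         for p in interleavings((x_gets_y, x_gets_x_plus_y), (y_gets_1,))}
```

The atomic version can only produce X = 0 or X = 2, while the split version also admits X = 1, obtained when the other thread writes Y between the two statements.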



3.2. Concrete Interference Semantics $\mathbb{P}_I$. Because it reasons on infinite sets of paths,
the concrete interleaving semantics from the previous section is not easily amenable to
abstraction. In particular, replacing the concrete domain $\mathcal{D}$ in $\mathbb{P}_*$ with an abstract one $\mathcal{D}^\sharp$
(as defined in Sec. 2.3) is not sufficient to obtain an effective and efficient static analyzer as
we still have a large or infinite number of paths to analyze separately and join. By contrast,
we propose here a (more abstract) concrete semantics that can be expressed by induction
on the syntax. It will lead naturally, after further abstraction in Sec. 3.3, to an effective
static analysis.

3.2.1. Thread semantics. We start by enriching the non-parallel structured semantics of
Sec. 2.2 with a notion of interference. We call interference a triple $(t, X, v) \in \mathbb{I}$, where
$\mathbb{I} \stackrel{\text{def}}{=} \mathcal{T} \times \mathcal{V} \times \mathbb{R}$, indicating that the thread $t$ can set the variable $X$ to the value $v$.
However, it does not say at which program point the assignment is performed, so it is
flow-insensitive information.

The new semantics of an expression $e$, denoted $\mathbb{E}_I[\![ e ]\!]$, takes as argument the current
thread $t \in \mathcal{T}$ and an interference set $I \subseteq \mathbb{I}$ in addition to an environment $\rho \in \mathcal{E}$. It is
defined in Fig. 7. The main change with respect to the interference-free semantics $\mathbb{E}[\![ e ]\!]$
of Fig. 3 is that, when fetching a variable $X \in \mathcal{V}$, each interference $(t', X, v) \in I$ on $X$
from any other thread $t' \neq t$ is applied. The semantics of constants and operators is not
changed, apart from propagating $t$ and $I$ recursively. Note that the choice of evaluating
$\mathbb{E}_I[\![ X ]\!](t, \rho, I)$ to $\rho(X)$ or to some interference in $I$, as well as the choice of the interference
in $I$, is non-deterministic. Thus, distinct occurrences of the same variable in an expression
may evaluate, in the same environment, to different values.
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Ej[el : (r X g X V{Z)) ^ {V{R) x P(£)) 

dcf 



Ex[Xl(t,p,/) °=^' ({p(X)}U{H3t'/i:(i',^,t')e/},0) 



dcf 



ExI [Cl, C2] l(t, /?,/) = ({ C G R I CI < C < C2 }, 



dot 



m -I e}{t,p,I) = \et{V,n)= Exlej{t,p,I)m{{-x\xeV},n) 

ExleiOee2J{t,p,I) = 

let (^1,^1)= Exleij{t,p,I)m 

let (y2,f^2)= Ex 1 62 Hi, P,^) in 

{{xi oa;2 I xi G Vi, X2 G V2, o / / V X2 / 0}, 

J7iUfi2U{£|o = /A0Gy2}) 
where o G {+, -, x,/} 

Figure 7: Concrete semantics of expressions with interference. 
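The evaluation rules of Fig. 7 can be illustrated by a small executable sketch. It is only an illustration under stated assumptions: the tuple encoding of expressions, the function name `eval_expr`, and the use of integer ranges as a finite stand-in for real intervals are all hypothetical choices, not part of the formal development.

```python
from itertools import product

def eval_expr(e, t, rho, interf):
    """Return (values, errors) for expression e evaluated by thread t.

    Hypothetical encoding of expressions as nested tuples:
      ("var", X) | ("const", c1, c2) | ("neg", e1) | ("op", o, label, e1, e2)
    interf is a set of interference triples (thread, variable, value).
    """
    kind = e[0]
    if kind == "var":
        x = e[1]
        # Own value from the environment, plus values written by other threads.
        vals = {rho[x]} | {v for (u, y, v) in interf if u != t and y == x}
        return vals, set()
    if kind == "const":
        # Assumption: integers in [c1, c2] stand in for the real interval.
        return set(range(e[1], e[2] + 1)), set()
    if kind == "neg":
        vals, errs = eval_expr(e[1], t, rho, interf)
        return {-v for v in vals}, errs
    _, o, label, e1, e2 = e
    v1, w1 = eval_expr(e1, t, rho, interf)
    v2, w2 = eval_expr(e2, t, rho, interf)
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    # Divisions by zero produce no value but report the operator label.
    vals = {ops[o](a, b) for a, b in product(v1, v2) if o != "/" or b != 0}
    errs = w1 | w2 | ({label} if o == "/" and 0 in v2 else set())
    return vals, errs

rho = {"X": 0}
interf = {("t2", "X", 5)}
x_minus_x = ("op", "-", "l1", ("var", "X"), ("var", "X"))
print(eval_expr(x_minus_x, "t1", rho, interf))
```

In this sketch, thread t1 evaluating X - X obtains the values -5, 0, and 5, because each occurrence of X independently reads either the local value or the interference from t2. Thread t2 ignores its own interference and obtains only 0.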

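$\mathsf{S}_I[\![s, t]\!] : \mathcal{D}_I \to \mathcal{D}_I$

$\mathsf{S}_I[\![ X \leftarrow e, t ]\!](R, \Omega, I) \stackrel{\text{def}}{=}$
    $(\emptyset, \Omega, I) \sqcup_I \underset{\rho \in R}{{\bigsqcup}_I}\ \mathbf{let}\ (V, \Omega') = \mathsf{E}_I[\![e]\!](t, \rho, I)\ \mathbf{in}$
    $(\{ \rho[X \mapsto v] \mid v \in V \},\ \Omega',\ \{ (t, X, v) \mid v \in V \})$

$\mathsf{S}_I[\![ e \bowtie 0?, t ]\!](R, \Omega, I) \stackrel{\text{def}}{=}$
    $(\emptyset, \Omega, I) \sqcup_I \underset{\rho \in R}{{\bigsqcup}_I}\ \mathbf{let}\ (V, \Omega') = \mathsf{E}_I[\![e]\!](t, \rho, I)\ \mathbf{in}$
    $(\{ \rho \mid \exists v \in V : v \bowtie 0 \},\ \Omega',\ \emptyset)$

$\mathsf{S}_I[\![ \mathbf{if}\ e \bowtie 0\ \mathbf{then}\ s, t ]\!](R, \Omega, I) \stackrel{\text{def}}{=}$
    $(\mathsf{S}_I[\![s, t]\!] \circ \mathsf{S}_I[\![ e \bowtie 0?, t ]\!])(R, \Omega, I) \sqcup_I \mathsf{S}_I[\![ e \not\bowtie 0?, t ]\!](R, \Omega, I)$

$\mathsf{S}_I[\![ \mathbf{while}\ e \bowtie 0\ \mathbf{do}\ s, t ]\!](R, \Omega, I) \stackrel{\text{def}}{=}$
    $\mathsf{S}_I[\![ e \not\bowtie 0?, t ]\!](\mathrm{lfp}\ \lambda X : (R, \Omega, I) \sqcup_I (\mathsf{S}_I[\![s, t]\!] \circ \mathsf{S}_I[\![ e \bowtie 0?, t ]\!])X)$

$\mathsf{S}_I[\![ s_1; s_2, t ]\!](R, \Omega, I) \stackrel{\text{def}}{=} (\mathsf{S}_I[\![s_2, t]\!] \circ \mathsf{S}_I[\![s_1, t]\!])(R, \Omega, I)$

Figure 8: Concrete semantics of statements with interference.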

The semantics of a statement $s$ executed by a thread $t \in \mathcal{T}$ is denoted $\mathsf{S}_I[\![s, t]\!]$. It is
presented in Fig. 8. This semantics is enriched with interferences and is thus a complete
$\sqcup_I$-morphism in the complete lattice:

$\mathcal{D}_I \stackrel{\text{def}}{=} \mathcal{P}(\mathcal{E}) \times \mathcal{P}(\mathcal{L}) \times \mathcal{P}(\mathcal{I})$

where the join $\sqcup_I$ is the pairwise set union. The main point of note is the semantics of
assignments $X \leftarrow e$. It both uses its interference set argument, passing it to $\mathsf{E}_I[\![e]\!]$, and
enriches it with the interferences generated on the assigned variable $X$. The semantics
of guards simply uses the interference set, while the semantics of conditionals, loops, and
sequences is identical to the non-interference one from Fig. 4. The structured semantics of
a thread $t$ with interferences $I$ is then $\mathsf{S}_I[\![ \mathit{body}_t, t ]\!](\mathcal{E}_0, \emptyset, I)$.

3.2.2. Program semantics. The semantics $\mathsf{S}_I[\![ \mathit{body}_t, t ]\!]$ still only analyzes the effect of a
single thread $t$. It assumes a priori knowledge of the other threads, through $I$, and
contributes to this knowledge, by enriching $I$. To solve this dependency and take into account
multiple threads, we iterate the analysis of all threads until interferences stabilize. Thus,
the semantics of a multi-threaded program is:

$\mathbb{P}_I \stackrel{\text{def}}{=} \Omega$, where $(\Omega, -) \stackrel{\text{def}}{=} \mathrm{lfp}\ \lambda(\Omega, I) : \bigsqcup_{t \in \mathcal{T}} \mathbf{let}\ (-, \Omega', I') = \mathsf{S}_I[\![ \mathit{body}_t, t ]\!](\mathcal{E}_0, \Omega, I)\ \mathbf{in}\ (\Omega', I')$    (3.4)

where the join $\sqcup$ is the componentwise set union in the complete lattice $\mathcal{P}(\mathcal{L}) \times \mathcal{P}(\mathcal{I})$.

3.2.3. Soundness and completeness. Before linking our interference semantics $\mathbb{P}_I$ (by
structural induction) to the interleaving semantics $\mathbb{P}_*$ of Sec. 3.1 (which is path-based), we remark
that we can restate the structured interference semantics $\mathsf{S}_I[\![ \mathit{body}_t, t ]\!]$ of a single thread $t$ in
a path-based form, as we did in Sec. 2.4 for non-parallel programs. Indeed, we can replace $\mathsf{S}$
with $\mathsf{S}_I$ in (2.4) and derive a path-based semantics with interference $\Pi_I[\![P, t]\!] : \mathcal{D}_I \to \mathcal{D}_I$
of a set of (non-parallel) control paths $P \subseteq \Pi$ in a thread $t$ as follows:

$\Pi_I[\![P, t]\!](R, \Omega, I) \stackrel{\text{def}}{=} \bigsqcup_I \{ (\mathsf{S}_I[\![s_n, t]\!] \circ \cdots \circ \mathsf{S}_I[\![s_1, t]\!])(R, \Omega, I) \mid s_1 \cdot \ldots \cdot s_n \in P \}$ .    (3.5)

These two forms are equivalent, and Thm. 2.3 naturally becomes:

Theorem 3.1. $\forall t \in \mathcal{T},\ s \in \mathit{stat} : \Pi_I[\![ \pi(s), t ]\!] = \mathsf{S}_I[\![s, t]\!]$.

Proof. In Appendix A.4. $\Box$

The following theorem then states that the semantics $\mathbb{P}_I$ computed with an interference
fixpoint is indeed sound with respect to the interleaving semantics $\mathbb{P}_*$, that interleaves paths
from all threads:

Theorem 3.2. $\mathbb{P}_* \subseteq \mathbb{P}_I$.

Proof. In Appendix A.5. $\Box$



The equality does not hold in general. Consider, for instance, the program fragment
in Fig. 9(a), inspired from Dekker's mutual exclusion algorithm [24]. According to the
interleaving semantics, both threads can never be in their critical section simultaneously.
The interference semantics, however, does not ensure mutual exclusion. Indeed, it computes
the following set of interferences: $\{ (t_1, \mathit{flag1}, 1),\ (t_2, \mathit{flag2}, 1) \}$. Thus, in thread $t_1$, flag2
evaluates to $\{0, 1\}$. The value 0 comes from the initial state $\mathcal{E}_0$ and the value 1 comes
from the interference $(t_2, \mathit{flag2}, 1)$. Likewise, flag1 evaluates to $\{0, 1\}$ in thread $t_2$. Thus,
both conditions flag1 = 0 and flag2 = 0 can be simultaneously true. This imprecision is
due to the flow-insensitive treatment of interferences. We now present a second example of
incompleteness where the loss of precision is amplified by the interference fixpoint. Consider
the program in Fig. 9(b) where two threads increment the same zero-initialized variable x.
According to the interleaving semantics, either the value 1 or 2 is stored into y. However, in
the interference semantics, the interference fixpoint builds a growing set of interferences, up
to $\{ (t, x, i) \mid t \in \{t_1, t_2\},\ i \geq 1 \}$, as each thread increments the possible values written by
the other thread. Note that the program features no loop and x can thus be incremented
only finitely many times (twice), but the interference abstraction is flow-insensitive and
forgets how many times an action can be performed. As a consequence, any positive value
can be stored into y, instead of only 1 or 2.
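The first incompleteness example can be replayed with a minimal sketch of the interference fixpoint, specialized by hand to the two threads of Fig. 9(a). The helper names `read`, `run_thread`, and `fixpoint` are hypothetical, and the per-thread analysis is hard-coded for this program rather than derived from a general statement semantics.

```python
def read(x, t, rho, interf):
    """Values thread t may read from x: its own value plus other threads' writes."""
    return {rho[x]} | {v for (u, y, v) in interf if u != t and y == x}

def run_thread(t, my_flag, other_flag, interf):
    """Analyze `my_flag <- 1; if other_flag = 0 then critical section`.

    Returns the enriched interference set and whether the guard can pass.
    """
    rho = {"flag1": 0, "flag2": 0}           # initial environment
    rho[my_flag] = 1                          # assignment my_flag <- 1
    new_interf = interf | {(t, my_flag, 1)}   # record the write as an interference
    can_enter = 0 in read(other_flag, t, rho, interf)
    return new_interf, can_enter

def fixpoint():
    """Re-analyze both threads until the interference set is stable."""
    interf = set()
    while True:
        i1, enter1 = run_thread("t1", "flag1", "flag2", interf)
        i2, enter2 = run_thread("t2", "flag2", "flag1", interf)
        new = i1 | i2
        if new == interf:
            return interf, enter1, enter2
        interf = new

print(fixpoint())
```

At the fixpoint, the sketch reports the two interferences on flag1 and flag2, and both guards can still pass: the value 0 read from the initial environment is never discarded, which is exactly the flow-insensitivity discussed above.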

(a) Mutual exclusion algorithm.    $\mathcal{E}_0$ : flag1 = flag2 = 0

    thread $t_1$                    thread $t_2$
    flag1 ← 1;                      flag2 ← 1;
    if flag2 = 0 then               if flag1 = 0 then
        critical section                critical section

(b) Parallel incrementation.    $\mathcal{E}_0$ : x = y = 0

    thread $t_1$                    thread $t_2$
    x ← x + 1;                      x ← x + 1
    y ← x

Figure 9: Incompleteness examples for the concrete interference semantics.

Our interference semantics is based on a decomposition of the invariant properties of
parallel programs into a local invariant at each thread program point and a global
interference invariant. This idea is not new, and complete methods to do so have already been
proposed. Such methods date back to the works of Owicki, Gries, and Lamport [49, 37, 40]
and have been formalized in the framework of Abstract Interpretation by Cousot and Cousot
[15]. We would say informally that our interference semantics is an incomplete abstraction
of such complete methods, where interferences are abstracted in a flow-insensitive and
non-relational way. Our choice to abstract away this information is a deliberate move that
considerably eases the construction of an effective and efficient static analyzer, as shown in
Sec. 3.3. Another strong incentive is that the interference semantics is compatible with the
use of weakly consistent memory models, as shown in Sec. 3.4. Note finally that Sec. 4
will present a method to recover a weak form of flow-sensitivity (i.e., mutual exclusion)
on interferences, without losing the efficiency nor the correctness with respect to weak
memory models.



3.3. Abstract Interference Semantics $\mathbb{P}^\sharp_I$. The concrete interference semantics $\mathbb{P}_I$
introduced in the previous section is defined by structural induction. It can thus be easily
abstracted to provide an effective, always-terminating, and sound static analysis.

We assume, as in Sec. 2.3, the existence of an abstract domain $\mathcal{E}^\sharp$ abstracting sets of
environments (see Fig. 5). Additionally, we assume the existence of an abstract domain
$\mathcal{N}^\sharp$ that abstracts sets of reals, which will be useful to abstract interferences. Its signature
is presented in Fig. 10. It is equipped with a concretization $\gamma_\mathcal{N} : \mathcal{N}^\sharp \to \mathcal{P}(\mathbb{R})$, a least
element $\bot^\sharp_\mathcal{N}$, an abstract join $\cup^\sharp_\mathcal{N}$ and, if it has strictly increasing infinite chains, a widening
$\triangledown_\mathcal{N}$. We also require two additional functions that will be necessary to communicate
information between $\mathcal{E}^\sharp$ and $\mathcal{N}^\sharp$. Firstly, a function $\mathit{get}(X, R^\sharp)$ that extracts from an abstract
environment $R^\sharp \in \mathcal{E}^\sharp$ the set of values a variable $X \in \mathcal{V}$ can take, and abstracts this set
in $\mathcal{N}^\sharp$. Secondly, a function $\mathit{as\text{-}expr}(V^\sharp)$ able to synthesize a (constant) expression
approximating any non-empty abstract value $V^\sharp \in \mathcal{N}^\sharp \setminus \{\bot^\sharp_\mathcal{N}\}$. This provides a simple way to
use an abstract value from $\mathcal{N}^\sharp$ in functions on abstract environments in $\mathcal{E}^\sharp$. For instance,
$\mathsf{S}^\sharp[\![ X \leftarrow \mathit{as\text{-}expr}(V^\sharp) ]\!](R^\sharp, \Omega)$ non-deterministically sets the variable $X$ in the environments
$\gamma_\mathcal{E}(R^\sharp)$ to any value in $\gamma_\mathcal{N}(V^\sharp)$.

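Any non-relational domain on a single variable can be used as $\mathcal{N}^\sharp$. One useful example
is the interval domain [13]. In this case, an element in $\mathcal{N}^\sharp$ is either $\bot^\sharp_\mathcal{N}$, or a pair consisting
of a lower and an upper bound. The function $\mathit{as\text{-}expr}$ is then straightforward because
intervals can be directly and exactly represented in the syntax of expressions. Moreover,
the function $\mathit{get}$ consists in extracting the range of a variable from an abstract environment
$R^\sharp \in \mathcal{E}^\sharp$, an operation which is generally available in the implementations of numerical
abstract domains, e.g., in the Apron library [36].

$\mathcal{N}^\sharp$    (abstract sets of reals)

$\gamma_\mathcal{N} : \mathcal{N}^\sharp \to \mathcal{P}(\mathbb{R})$    (concretization function)

$\bot^\sharp_\mathcal{N} \in \mathcal{N}^\sharp$    (abstract empty set)
    s.t. $\gamma_\mathcal{N}(\bot^\sharp_\mathcal{N}) = \emptyset$

$\cup^\sharp_\mathcal{N} : (\mathcal{N}^\sharp \times \mathcal{N}^\sharp) \to \mathcal{N}^\sharp$    (abstract join)
    s.t. $\gamma_\mathcal{N}(V^\sharp \cup^\sharp_\mathcal{N} W^\sharp) \supseteq \gamma_\mathcal{N}(V^\sharp) \cup \gamma_\mathcal{N}(W^\sharp)$

$\triangledown_\mathcal{N} : (\mathcal{N}^\sharp \times \mathcal{N}^\sharp) \to \mathcal{N}^\sharp$    (widening)
    s.t. $\gamma_\mathcal{N}(V^\sharp \triangledown_\mathcal{N} W^\sharp) \supseteq \gamma_\mathcal{N}(V^\sharp) \cup \gamma_\mathcal{N}(W^\sharp)$
    and $\forall (W^\sharp_i)_{i \in \mathbb{N}}$ : the sequence $V^\sharp_0 \stackrel{\text{def}}{=} W^\sharp_0$, $V^\sharp_{i+1} \stackrel{\text{def}}{=} V^\sharp_i \triangledown_\mathcal{N} W^\sharp_{i+1}$
    reaches a fixpoint $V^\sharp_k = V^\sharp_{k+1}$ for some $k \in \mathbb{N}$

$\mathit{get} : (\mathcal{V} \times \mathcal{E}^\sharp) \to \mathcal{N}^\sharp$    (variable extraction)
    s.t. $\gamma_\mathcal{N}(\mathit{get}(X, R^\sharp)) \supseteq \{ \rho(X) \mid \rho \in \gamma_\mathcal{E}(R^\sharp) \}$

$\mathit{as\text{-}expr} : (\mathcal{N}^\sharp \setminus \{\bot^\sharp_\mathcal{N}\}) \to \mathit{expr}$    (conversion to expression)
    s.t. $\forall \rho : \mathbf{let}\ (V, -) = \mathsf{E}[\![ \mathit{as\text{-}expr}(V^\sharp) ]\!] \rho\ \mathbf{in}\ V \supseteq \gamma_\mathcal{N}(V^\sharp)$

Figure 10: Signature, soundness and termination conditions for a domain $\mathcal{N}^\sharp$ abstracting
sets of reals.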
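As an illustration of the signature of Fig. 10, the following sketch instantiates it with intervals. It is a minimal example under stated assumptions: the names `BOT`, `join`, `widen`, `get`, and `as_expr`, the encoding of intervals as pairs of floats, and the tuple returned by `as_expr` are hypothetical, and the abstract environment is assumed to be a plain dictionary from variables to intervals (i.e., a non-relational domain).

```python
import math

BOT = None  # abstract empty set; other elements are pairs (lo, hi) with lo <= hi

def join(a, b):
    """Abstract join: smallest interval containing both arguments."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Standard interval widening: unstable bounds jump to infinity."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    lo = a[0] if a[0] <= b[0] else -math.inf
    hi = a[1] if a[1] >= b[1] else math.inf
    return (lo, hi)

def get(x, env):
    """Variable extraction from a non-relational environment (a dict of intervals)."""
    return env[x]

def as_expr(a):
    """Conversion to a constant interval expression; undefined on BOT."""
    if a is BOT:
        raise ValueError("as_expr is only defined on non-empty abstract values")
    return ("const", a[0], a[1])

print(join((0, 1), (3, 4)), widen((0, 1), (0, 2)))
```

Each bound can be widened at most once, so any widening sequence stabilizes after finitely many steps, which is the termination condition required of the widening.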



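$\mathcal{I}^\sharp \stackrel{\text{def}}{=} (\mathcal{T} \times \mathcal{V}) \to \mathcal{N}^\sharp$

$\gamma_\mathcal{I} : \mathcal{I}^\sharp \to \mathcal{P}(\mathcal{I})$
    s.t. $\gamma_\mathcal{I}(I^\sharp) \stackrel{\text{def}}{=} \{ (t, X, v) \mid t \in \mathcal{T},\ X \in \mathcal{V},\ v \in \gamma_\mathcal{N}(I^\sharp(t, X)) \}$

$\bot^\sharp_\mathcal{I} \stackrel{\text{def}}{=} \lambda(t, X) : \bot^\sharp_\mathcal{N}$

$I^\sharp_1 \cup^\sharp_\mathcal{I} I^\sharp_2 \stackrel{\text{def}}{=} \lambda(t, X) : I^\sharp_1(t, X) \cup^\sharp_\mathcal{N} I^\sharp_2(t, X)$

$I^\sharp_1 \triangledown_\mathcal{I} I^\sharp_2 \stackrel{\text{def}}{=} \lambda(t, X) : I^\sharp_1(t, X) \triangledown_\mathcal{N} I^\sharp_2(t, X)$

Figure 11: Abstract domain $\mathcal{I}^\sharp$ of interferences, derived from $\mathcal{N}^\sharp$.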

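We now show how, given these domains, we can construct an abstraction $\mathbb{P}^\sharp_I$ of $\mathbb{P}_I$. We
first construct, using $\mathcal{N}^\sharp$, an abstraction $\mathcal{I}^\sharp$ of interference sets from $\mathcal{P}(\mathcal{I})$, as presented in
Fig. 11. It is simply a partitioning of abstract sets of real values with respect to threads
and variables: $\mathcal{I}^\sharp \stackrel{\text{def}}{=} (\mathcal{T} \times \mathcal{V}) \to \mathcal{N}^\sharp$, together with pointwise concretization $\gamma_\mathcal{I}$, join $\cup^\sharp_\mathcal{I}$,
and widening $\triangledown_\mathcal{I}$. Note that $\mathcal{I}^\sharp$ is not isomorphic to a non-relational domain on a set $\mathcal{T} \times \mathcal{V}$
of variables. Indeed, the former abstracts $(\mathcal{T} \times \mathcal{V}) \to \mathcal{P}(\mathbb{R}) \simeq \mathcal{P}(\mathcal{T} \times \mathcal{V} \times \mathbb{R}) = \mathcal{P}(\mathcal{I})$, while
the latter would abstract $\mathcal{P}((\mathcal{T} \times \mathcal{V}) \to \mathbb{R})$. In particular, the former can express abstract
states where the value set of some but not all variables is empty, while $\bot^\sharp_\mathcal{N}$ elements in
the latter coalesce to a single element representing $\emptyset$. We then construct an abstraction
$\mathcal{D}^\sharp_I$ of the semantic domain $\mathcal{D}_I$, as presented in Fig. 12. An element of $\mathcal{D}^\sharp_I$ is a triple
$(R^\sharp, \Omega, I^\sharp)$ composed of an abstraction $R^\sharp \in \mathcal{E}^\sharp$ of environments, a set $\Omega \subseteq \mathcal{L}$ of errors, and
an abstraction $I^\sharp \in \mathcal{I}^\sharp$ of interferences. The concretization $\gamma$, join $\cup^\sharp$, and widening $\triangledown$ are
defined pointwise.

$(R^\sharp_1, \Omega_1, I^\sharp_1) \cup^\sharp (R^\sharp_2, \Omega_2, I^\sharp_2) \stackrel{\text{def}}{=} (R^\sharp_1 \cup^\sharp_\mathcal{E} R^\sharp_2,\ \Omega_1 \cup \Omega_2,\ I^\sharp_1 \cup^\sharp_\mathcal{I} I^\sharp_2)$

$(R^\sharp_1, \Omega_1, I^\sharp_1) \triangledown (R^\sharp_2, \Omega_2, I^\sharp_2) \stackrel{\text{def}}{=} (R^\sharp_1 \triangledown_\mathcal{E} R^\sharp_2,\ \Omega_1 \cup \Omega_2,\ I^\sharp_1 \triangledown_\mathcal{I} I^\sharp_2)$

Figure 12: Abstract semantic domain $\mathcal{D}^\sharp_I$, derived from $\mathcal{E}^\sharp$ and $\mathcal{I}^\sharp$.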

The abstract semantics $\mathsf{S}^\sharp_I[\![s, t]\!]$ of a statement $s$ executed in a thread $t \in \mathcal{T}$ should be
a function from $\mathcal{D}^\sharp_I$ to $\mathcal{D}^\sharp_I$ obeying the soundness condition:

$(\mathsf{S}_I[\![s, t]\!] \circ \gamma)(R^\sharp, \Omega, I^\sharp) \sqsubseteq_I (\gamma \circ \mathsf{S}^\sharp_I[\![s, t]\!])(R^\sharp, \Omega, I^\sharp)$

i.e., the abstract function over-approximates the sets of environments, errors, and
interferences. Such a function is defined in a generic way in Fig. 13. The semantics of assignments
and guards with interference is defined based on their non-interference semantics $\mathsf{S}^\sharp[\![s]\!]$,
provided as part of the abstract domain $\mathcal{E}^\sharp$. In both cases, the expression $e$ to assign or test
is first modified to take interferences into account, using the $\mathit{apply}$ function. This function
takes as arguments a thread $t \in \mathcal{T}$, an abstract environment $R^\sharp \in \mathcal{E}^\sharp$, an abstract
interference $I^\sharp \in \mathcal{I}^\sharp$, and an expression $e$. It first collects, for each variable $Y \in \mathcal{V}$, the relevant
interferences $V^\sharp_Y \in \mathcal{N}^\sharp$ from $I^\sharp$, i.e., concerning the variable $Y$ and threads $t' \neq t$. If the
interference for $Y$ is empty, $\bot^\sharp_\mathcal{N}$, the occurrences of $Y$ in $e$ are kept unmodified. If it is
not empty, then the occurrences of $Y$ are replaced with a constant expression encompassing
all the possible values that can be read from $Y$, from either the interferences or the
environments $\gamma_\mathcal{E}(R^\sharp)$. Additionally, the semantics of an assignment $X \leftarrow e$ enriches $I^\sharp$ with
new interferences corresponding to the values of $X$ after the assignment. The semantics of
non-primitive statements is identical to the interference-free case of Fig. 6.



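Finally, an abstraction of the interference fixpoint (3.4) is computed by iteration on
abstract interferences, using the widening $\triangledown_\mathcal{I}$ to ensure termination, which provides the
abstract semantics $\mathbb{P}^\sharp_I$ of our program:

$\mathbb{P}^\sharp_I \stackrel{\text{def}}{=} \Omega$, where $(\Omega, -) \stackrel{\text{def}}{=}$
    $\lim \lambda(\Omega, I^\sharp) : \mathbf{let}\ \forall t \in \mathcal{T} : (-, \Omega'_t, I^{\sharp\prime}_t) = \mathsf{S}^\sharp_I[\![ \mathit{body}_t, t ]\!](\mathcal{E}^\sharp_0, \Omega, I^\sharp)\ \mathbf{in}$    (3.6)
    $(\bigcup \{ \Omega'_t \mid t \in \mathcal{T} \},\ I^\sharp \triangledown_\mathcal{I} \bigcup^\sharp_\mathcal{I} \{ I^{\sharp\prime}_t \mid t \in \mathcal{T} \})$

where $\lim F^\sharp$ denotes the limit of the iterates of $F^\sharp$ starting from $(\emptyset, \bot^\sharp_\mathcal{I})$. The following
theorem states the soundness of the analysis:

Theorem 3.3. $\mathbb{P}_I \subseteq \mathbb{P}^\sharp_I$.

Proof. In Appendix A.6. $\Box$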

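$\mathsf{S}^\sharp_I[\![ X \leftarrow e, t ]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=}$
    $\mathbf{let}\ (R^{\sharp\prime}, \Omega') = \mathsf{S}^\sharp[\![ X \leftarrow \mathit{apply}(t, R^\sharp, I^\sharp, e) ]\!](R^\sharp, \Omega)\ \mathbf{in}$
    $(R^{\sharp\prime},\ \Omega',\ I^\sharp[(t, X) \mapsto I^\sharp(t, X) \cup^\sharp_\mathcal{N} \mathit{get}(X, R^{\sharp\prime})])$

$\mathsf{S}^\sharp_I[\![ e \bowtie 0?, t ]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=}$
    $\mathbf{let}\ (R^{\sharp\prime}, \Omega') = \mathsf{S}^\sharp[\![ \mathit{apply}(t, R^\sharp, I^\sharp, e) \bowtie 0? ]\!](R^\sharp, \Omega)\ \mathbf{in}\ (R^{\sharp\prime}, \Omega', I^\sharp)$

$\mathsf{S}^\sharp_I[\![ \mathbf{if}\ e \bowtie 0\ \mathbf{then}\ s, t ]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} \ldots$

where:

$\mathit{apply}(t, R^\sharp, I^\sharp, e) \stackrel{\text{def}}{=}$
    $\mathbf{let}\ \forall Y \in \mathcal{V} : V^\sharp_Y = \bigcup^\sharp_\mathcal{N} \{ I^\sharp(t', Y) \mid t' \neq t \}\ \mathbf{in}$
    $\mathbf{let}\ \forall Y \in \mathcal{V} : e_Y = Y$ if $V^\sharp_Y = \bot^\sharp_\mathcal{N}$, and $e_Y = \mathit{as\text{-}expr}(V^\sharp_Y \cup^\sharp_\mathcal{N} \mathit{get}(Y, R^\sharp))$ if $V^\sharp_Y \neq \bot^\sharp_\mathcal{N}$, $\mathbf{in}$
    $e[\forall Y \in \mathcal{V} : Y \mapsto e_Y]$

Figure 13: Abstract semantics of statements with interference.

The obtained analysis remains flow-sensitive and can be relational within each thread,
provided that $\mathcal{E}^\sharp$ is relational. However, interferences are abstracted in a flow-insensitive
and non-relational way. This was already the case for the concrete interferences in $\mathbb{P}_I$ and
it is not related to the choice of abstract domains. The analysis is expressed as an outer
iteration that completely re-analyzes each thread until the abstract interferences stabilize.
Thus, it can be implemented easily on top of an existing non-parallel analyzer. Compared
to a non-parallel program analysis, the cost is multiplied by the number of outer iterations
required to stabilize interferences. Thankfully, our preliminary experimental results suggest
that this number remains very low in practice (5 for our benchmark in Sec. 5). In any case,
the overall cost is not related to the (combinatorial) number of possible interleavings, but
rather to the amount of abstract interferences $I^\sharp$, i.e., of actual communications between
the threads. It is thus always possible to speed up the convergence of interferences or,
conversely, improve the precision at the expense of speed, by adapting the widening $\triangledown_\mathcal{N}$.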
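The outer iteration with widening can be sketched on the parallel incrementation program of Fig. 9(b), using intervals for both environments and interferences. This is a hand-specialized illustration, not the generic analysis: the functions `read`, `analyze`, and `solve`, the dictionary encoding of abstract interferences, and the shortcut of joining intervals directly instead of going through an expression are all assumptions made for brevity.

```python
import math

BOT = None  # abstract empty set; other elements are intervals (lo, hi)

def join(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (a[0] if a[0] <= b[0] else -math.inf,
            a[1] if a[1] >= b[1] else math.inf)

def read(x, t, env, interf):
    """Interval thread t may read from x: own range joined with other threads' writes."""
    others = BOT
    for (u, y), itv in interf.items():
        if u != t and y == x:
            others = join(others, itv)
    return join(env[x], others)

def analyze(t, interf):
    """Analyze one thread of Fig. 9(b); return its generated interferences and final env."""
    env = {"x": (0, 0), "y": (0, 0)}
    out = {}
    lo, hi = read("x", t, env, interf)
    env["x"] = (lo + 1, hi + 1)                 # x <- x + 1
    out[(t, "x")] = env["x"]
    if t == "t1":
        env["y"] = read("x", t, env, interf)    # y <- x (thread t1 only)
        out[(t, "y")] = env["y"]
    return out, env

def solve():
    """Re-analyze all threads, widening interferences, until they are stable."""
    keys = [("t1", "x"), ("t1", "y"), ("t2", "x")]
    interf = {k: BOT for k in keys}
    while True:
        cand, envs = dict(interf), {}
        for t in ("t1", "t2"):
            out, envs[t] = analyze(t, interf)
            for k, itv in out.items():
                cand[k] = join(cand[k], itv)
        new = {k: widen(interf[k], cand[k]) for k in keys}
        if new == interf:
            return interf, envs
        interf = new

interf, envs = solve()
print(interf[("t1", "x")], envs["t1"]["y"])
```

In this sketch the upper bound of the interference on x is unstable at the second outer iteration and is widened to infinity, so y is bounded only from below by 1. Delaying the widening would not help here, since the underlying concrete interference set is itself unbounded.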

In this article, we focus on analyzing systems composed of a fixed, finite number of
threads. The finiteness of $\mathcal{T}$ is necessary for the computation of $\mathbb{P}^\sharp_I$ in (3.6) to be effective.
However, it is actually possible to relax this hypothesis and allow an unbounded number
of instances of some threads to run in parallel. For this, it is sufficient to consider
self-interferences, i.e., replace the condition $t' \neq t$ in the definition of $\mathsf{E}_I[\![X]\!](t, \rho, I)$ in Fig. 7
(for the concrete semantics) and of $\mathit{apply}(t, R^\sharp, I^\sharp, e)$ in Fig. 13 (for the abstract semantics)
with $t' \neq t \vee t \in \mathcal{T}'$, where $\mathcal{T}' \subseteq \mathcal{T}$ denotes the subset of threads that can have several
instances. The resulting analysis is necessarily uniform, i.e., it cannot distinguish different
instances of the same thread nor express properties about the number of running instances:
this number is abstracted statically in a domain of two values, "one" ($t \notin \mathcal{T}'$) and "two or more"
($t \in \mathcal{T}'$). In order to analyze actual programs spawning an unbounded number of threads,
a non-uniform analysis (such as performed by Feret [26] in the context of the $\pi$-calculus)
may be necessary to achieve a sufficient precision, but this is not the purpose of the present
article.

3.4. Weakly Consistent Memory Semantics $\mathbb{P}'_*$. We now review the various parallel
semantics we proposed in the preceding sections and discuss their adequacy to describe
actual models of parallel executions.



It appears that our first semantics, the concrete interleaving semantics $\mathbb{P}_*$ of Sec. 3.1,
while simple, is not realistic. A first issue is that, as noted by Reynolds in [52], such a
semantics requires choosing a level of granularity, i.e., some basic set of operations that
are assumed to be atomic and cannot be interrupted by another thread. In our case, we
assumed assignments and guards (i.e., primitive statements) to be atomic. In contrast, an
actual system may schedule a thread within an assignment and cause, for instance, x to be 1
at the end of the program in Fig. 9(b) instead of the expected value 2. A second issue, noted
by Lamport in [38], is that the latency of loads and stores in a shared memory may break
the sequential consistency in true multiprocessor systems: threads running on different
processors may not always agree on the value of a shared variable. For instance, in the
mutual exclusion algorithm of Fig. 9(a), the thread $t_2$ may still see the value 0 in flag1 even
after the thread $t_1$ has entered its critical section, causing $t_2$ to also enter its critical section,
as the effect of the assignment flag1 $\leftarrow$ 1 is propagated asynchronously and takes some time
to be acknowledged by $t_2$. Moreover, Lamport noted in [39] that reordering of independent
loads and stores in one thread by the processor can also break sequential consistency,
for instance by performing the load from flag2 before the store into flag1, instead of after,
in the thread $t_1$ in Fig. 9(a). More recently, it has been observed by Manson et al. [43]
that optimizations in modern compilers have the same ill-effect, even on mono-processor
systems: program transformations that are perfectly safe on a thread considered in isolation
(for instance, reordering the independent assignment flag1 $\leftarrow$ 1 and test flag2 = 0 in $t_1$)
can cause non-sequentially-consistent behaviors to appear. In this section, we show that
the interference semantics correctly handles these issues by proving that it is invariant
under a "reasonable" class of program transformations. This is a consequence of its coarse,
flow-insensitive and non-relational modeling of thread communications.

Acceptable program transformations of a thread are defined with respect to the
path-based semantics $\Pi[\![\cdot]\!]$ of Sec. 2.4. A transformation of a thread $t$ is acceptable if it gives rise
to a set $\pi'(t) \subseteq \Pi$ of control paths such that every path $p' \in \pi'(t)$ can be obtained from a
path $p \in \pi(\mathit{body}_t)$ by a sequence of elementary transformations described below in Def. 3.4.
Elementary transformations are denoted $q \rightsquigarrow q'$, where $q$ and $q'$ are sequences of primitive
statements. This notation indicates that any occurrence of $q$ in a path of a thread can be
replaced with $q'$, whatever the context appearing before and after $q$. The transformations
in Def. 3.4 try to account for widespread compiler and hardware optimizations, but are
restricted to transformations that do not generate new errors nor new interferences. This
ensures that an interference-based analysis of the original program is sound with respect to
the transformed one, which is formalized below in Thm. 3.5.

Footnote: The environments at the end of the thread after transformations may be different, but this does not
pose a problem as environments are not observable in our semantics: $\mathbb{P}_* \subseteq \mathcal{L}$ (3.3).

The elementary transformations of Def. 3.4 require some side-conditions to hold in
order to be acceptable. They use the following notions. We say that a variable $X \in \mathcal{V}$
is fresh if it does not occur in any thread, and local if it occurs only in the currently
transformed thread. We denote by $s[e'/e]$ the statement $s$ where some, but not necessarily
all, occurrences of the expression $e$ may be changed into $e'$. The set of variables appearing in
the expression $e$ is denoted $\mathit{var}(e)$, while the set of variables modified by the statement $s$ is
$\mathit{lval}(s)$. Thus, $\mathit{lval}(X \leftarrow e) = \{X\}$ while $\mathit{lval}(e \bowtie 0?) = \emptyset$. The predicate $\mathit{nonblock}(e)$ holds
if evaluating the expression $e$ cannot block the program (as would, e.g., an expression
with a definite run-time error, such as $1/0$), i.e., $\mathit{nonblock}(e) \iff \forall \rho \in \mathcal{E} : V_\rho \neq \emptyset$ where
$(V_\rho, -) = \mathsf{E}[\![e]\!] \rho$. We say that $e$ is deterministic if, moreover, $\forall \rho \in \mathcal{E} : |V_\rho| = 1$. Finally,
$\mathit{noerror}(e)$ holds if evaluating $e$ is always error-free, i.e., $\mathit{noerror}(e) \iff \forall \rho \in \mathcal{E} : \Omega_\rho = \emptyset$
where $(-, \Omega_\rho) = \mathsf{E}[\![e]\!] \rho$. We are now ready to state our transformations:

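Definition 3.4 (Elementary path transformations).

(1) Redundant store elimination: $X \leftarrow e_1 \cdot X \leftarrow e_2 \rightsquigarrow X \leftarrow e_2$, when $X \notin \mathit{var}(e_2)$ and
$\mathit{nonblock}(e_1)$.

(2) Identity store elimination: $X \leftarrow X \rightsquigarrow \varepsilon$.

(3) Reordering assignments: $X_1 \leftarrow e_1 \cdot X_2 \leftarrow e_2 \rightsquigarrow X_2 \leftarrow e_2 \cdot X_1 \leftarrow e_1$, when $X_1 \notin \mathit{var}(e_2)$,
$X_2 \notin \mathit{var}(e_1)$, $X_1 \neq X_2$, and $\mathit{nonblock}(e_1)$.

(4) Reordering guards: $e_1 \bowtie 0? \cdot e_2 \bowtie' 0? \rightsquigarrow e_2 \bowtie' 0? \cdot e_1 \bowtie 0?$, when $\mathit{noerror}(e_2)$.

(5) Reordering guards before assignments: $X_1 \leftarrow e_1 \cdot e_2 \bowtie 0? \rightsquigarrow e_2 \bowtie 0? \cdot X_1 \leftarrow e_1$, when
$X_1 \notin \mathit{var}(e_2)$ and either $\mathit{nonblock}(e_1)$ or $\mathit{noerror}(e_2)$.

(6) Reordering assignments before guards: $e_1 \bowtie 0? \cdot X_2 \leftarrow e_2 \rightsquigarrow X_2 \leftarrow e_2 \cdot e_1 \bowtie 0?$, when
$X_2 \notin \mathit{var}(e_1)$, $X_2$ is local, and $\mathit{noerror}(e_2)$.

(7) Assignment propagation: $X \leftarrow e \cdot s \rightsquigarrow X \leftarrow e \cdot s[e/X]$, when $X \notin \mathit{var}(e)$, $\mathit{var}(e)$ are
local, and $e$ is deterministic.

(8) Sub-expression elimination: $s_1 \cdot \ldots \cdot s_n \rightsquigarrow X \leftarrow e \cdot s_1[X/e] \cdot \ldots \cdot s_n[X/e]$, when $X$ is
fresh, $\forall i : \mathit{var}(e) \cap \mathit{lval}(s_i) = \emptyset$, and $\mathit{noerror}(e)$.

(9) Expression simplification: $s \rightsquigarrow s[e'/e]$, when $\forall \rho \in \mathcal{E} : \mathsf{E}[\![e]\!] \rho \sqsupseteq \mathsf{E}[\![e']\!] \rho$ and $\mathit{var}(e)$ and
$\mathit{var}(e')$ are local.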

These simple rules, used in combination, allow modeling large classes of classic program 
transformations as well as distributed memories. Store latency can be simulated using 
rules 7 and 3. Breaking a statement into several ones is possible with rules 7 and 8. As 
a consequence, the rules can expose preemption points within statements, which makes 
primitive statements no longer atomic. Global optimizations, such as constant propagation 
and folding, can be achieved using rules 7 and 9. Rules 1-6 allow peephole optimizations. 
Additionally, transformations that do not change the set of control paths, such as loop 
unrolling, are naturally supported. 

Given the set of transformed control paths $\pi'(t)$ for each thread $t \in \mathcal{T}$, the set of
transformed parallel control paths $\pi'_*$ is defined, similarly to (3.2), as:

$\pi'_* \stackrel{\text{def}}{=} \{ p \in \Pi_* \mid \forall t \in \mathcal{T} : \mathit{proj}_t(p) \in \pi'(t) \}$    (3.7)

and the semantics $\mathbb{P}'_*$ of the parallel program is, similarly to (3.3):

$\mathbb{P}'_* \stackrel{\text{def}}{=} \Omega$, where $(-, \Omega) = \Pi_*[\![ \pi'_* ]\!](\mathcal{E}_0, \emptyset)$ .    (3.8)

Any original thread $\pi(\mathit{body}_t)$ being a special case of transformed thread $\pi'(t)$ (considering
the identity transformation), we have $\mathbb{P}_* \subseteq \mathbb{P}'_*$. The following theorem extends Thm. 3.2 to
transformed programs:

Theorem 3.5. $\mathbb{P}'_* \subseteq \mathbb{P}_I$.

Proof. In Appendix A.7. $\Box$

Footnote on rule (9) of Def. 3.4: The original expression simplification rule from [47] required a much stronger side-condition:
$\mathsf{E}_I[\![e]\!](t, \rho, I) \sqsupseteq \mathsf{E}_I[\![e']\!](t, \rho, I)$ for all $\rho$ and $I$, which actually implied that $e$ and $e'$ were variable-free. We propose
here a more permissive side-condition allowing local variables to appear in $e$ and $e'$.



An immediate consequence of Thms. 3.3 and 3.5 is the soundness of the abstract
semantics $\mathbb{P}^\sharp_I$ with respect to the concrete semantics of the transformed program $\mathbb{P}'_*$, i.e.,
$\mathbb{P}'_* \subseteq \mathbb{P}^\sharp_I$. Note that, in general, $\mathbb{P}'_* \neq \mathbb{P}_I$. The two semantics may coincide, as for instance
in the program of Fig. 9(a). In that case: $\mathbb{P}_* \subsetneq \mathbb{P}'_* = \mathbb{P}_I$. However, in the case of Fig. 9(b),
y can take any positive value according to the interference semantics $\mathbb{P}_I$ (as explained in
Sec. 3.2), while the interleaving semantics after program transformation $\mathbb{P}'_*$ only allows the
values 1 and 2; we have $\mathbb{P}_* = \mathbb{P}'_* \subsetneq \mathbb{P}_I$.
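The effect of an acceptable transformation on the mutual exclusion program of Fig. 9(a) can be checked by brute-force enumeration of interleavings. The sketch below is an illustration under stated assumptions: paths are encoded as lists of hypothetical `("set", var)` and `("test", var)` pairs, the critical sections are left implicit, and the transformed paths are obtained by applying rule (5) of Def. 3.4 by hand in each thread (the guard on the other flag is moved before the store to the thread's own flag).

```python
def interleavings(p, q):
    """Yield all interleavings of two paths, preserving the order within each."""
    if not p or not q:
        yield list(p) + list(q)
        return
    for rest in interleavings(p[1:], q):
        yield [p[0]] + rest
    for rest in interleavings(p, q[1:]):
        yield [q[0]] + rest

def feasible(path):
    """Execute a path from flag1 = flag2 = 0; False as soon as a guard `var = 0?` fails."""
    env = {"flag1": 0, "flag2": 0}
    for kind, var in path:
        if kind == "set":
            env[var] = 1          # var <- 1
        elif env[var] != 0:       # guard var = 0? filters out this path
            return False
    return True

# Original paths leading both threads into their critical sections.
T1 = [("set", "flag1"), ("test", "flag2")]
T2 = [("set", "flag2"), ("test", "flag1")]
# Paths after reordering the guard before the assignment (rule 5) in each thread.
T1R = [("test", "flag2"), ("set", "flag1")]
T2R = [("test", "flag1"), ("set", "flag2")]

original = any(feasible(p) for p in interleavings(T1, T2))
transformed = any(feasible(p) for p in interleavings(T1R, T2R))
print(original, transformed)
```

No interleaving of the original paths lets both guards pass, whereas the transformed paths admit one (both tests first, then both stores). This matches the claim that mutual exclusion holds for the sequentially consistent semantics but not after program transformation.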



Theorem 3.5 holds for our "reasonable" collection of program transformations, but may
not hold when considering other, "unreasonable" ones. For instance, in Fig. 9(a), flag1 $\leftarrow$ 1
should not be replaced by a misguided prefetching optimizer with flag1 $\leftarrow$ 42; flag1 $\leftarrow$ 1 in
thread $t_1$. This would create a spurious interference causing the value 42 to be possibly
seen by thread $t_2$. If there is no other reason for $t_2$ to see the value 42, such as a previous
or future assignment of 42 into flag1 by $t_1$, it would create an execution outside those
considered by the interference semantics and invalidate Thm. 3.5. Such "out-of-thin-air"
values are explicitly forbidden by the Java semantics as described by Manson et al. [43].
See also [55] for an in-depth discussion of out-of-thin-air values. Another example of invalid
transformation is the reordering of assignments $X_1 \leftarrow e_1 \cdot X_2 \leftarrow e_2 \rightsquigarrow X_2 \leftarrow e_2 \cdot X_1 \leftarrow e_1$
when $e_1$ may block the program, e.g., due to a division by zero $X_1 \leftarrow 1/0$. Indeed, the
transformed program could expose errors in $e_2$ that cannot occur in the original program
because they are masked by the previous error in $X_1 \leftarrow e_1$. This case is explicitly forbidden
by the $\mathit{nonblock}(e_1)$ side condition in Def. 3.4(3). The proof in Appendix A.7 contains more
examples of transformations that become invalid when side-conditions are not respected.

Definition 3.4 is not exhaustive. It could be extended with other "reasonable"
transformations, and some restrictive side-conditions might be relaxed in future work without
breaking Thm. 3.5. It is also possible to enrich Def. 3.4 with new transformations that do
not respect Thm. 3.5 as is, and then adapt the interference semantics to retrieve a similar
theorem. For instance, we could allow speculative stores of some special value, such as
zero, which only requires adding an interference $(t, X, 0)$ for each thread $t$ and each
variable $X$ modified by $t$. As another example, we could consider some memory writes to be
non-atomic, such as 64-bit writes on 32-bit computers, which requires adding interferences
that expose partially assigned values.

Finally, it would be tempting to, dually, reduce the number of allowed program
transformations, and enforce a stronger memory consistency. For instance, we could replace Def. 3.4
with a model of an actual multiprocessor, such as the Intel x86 architecture model proposed
by Sewell et al. in [57], which is far less permissive and thus ensures many more
properties. We would obtain a more precise interleaving semantics $\mathbb{P}'_*$, closer to the sequentially
consistent one $\mathbb{P}_*$. However, this would not mechanically improve the result of our static
analysis $\mathbb{P}^\sharp_I$, as it is actually an abstraction of the concrete interference semantics $\mathbb{P}_I$, itself
an incomplete abstraction of $\mathbb{P}_*$. Our choice of an interference semantics was not initially
motivated by the modeling of weakly consistent memories (although this is an important
side effect), but rather by the construction of an effective and efficient static analyzer.
Effectively translating a refinement of the memory model at the level of an interference-based
analysis without sacrificing the efficiency remains challenging future work.

4. Multi-threaded Programs With a Real-Time Scheduler

We now extend the language and semantics of the preceding section with explicit
synchronization primitives. These can be used to enforce mutual exclusion and construct critical
sections, avoiding the pitfalls of weakly consistent memories. We also extend the semantics
with a real-time scheduler taking thread priorities into account, which provides an alternate
way of implementing synchronization.

4.1. Priorities and Synchronization Primitives. We first describe the syntactic
additions to our language and introduce informally the change in semantics.

We denote by $\mathcal{M}$ a finite, fixed set of mutual exclusion locks, so-called mutexes. The
original language of Fig. 2 is enriched with primitives to control mutexes and scheduling as
follows:

    stat ::= lock(m)             (mutex locking, $m \in \mathcal{M}$)
          |  unlock(m)           (mutex unlocking, $m \in \mathcal{M}$)
          |  X ← islocked(m)     (mutex testing, $X \in \mathcal{V}$, $m \in \mathcal{M}$)        (4.1)
          |  yield               (thread pause)

The primitives lock(m) and unlock(m) respectively acquire and release the mutex
$m \in \mathcal{M}$. The primitive X ← islocked(m) is used to test the status of the mutex m: it
stores 1 into X if m is acquired by some thread, and 0 if it is free. The primitive yield is
used to voluntarily relinquish the control to the scheduler. The definition of control paths
$\pi(s)$ from (2.3) is extended by stating that $\pi(s) \stackrel{\text{def}}{=} \{s\}$ for these statements, i.e., they are
primitive statements. We also assume that threads have fixed, distinct priorities. As only
the ordering of priorities is significant, we denote threads in $\mathcal{T}$ simply by integers ranging
from 1 to $|\mathcal{T}|$, it being understood that thread $t$ has a strictly higher priority than thread $t'$
when $t > t'$.

To keep our semantics simple, we assume that acquiring a mutex for a thread already 
owning it is a no-op, as well as releasing a mutex it does not hold. Our primitive mutexes 
can serve as the basis to implement more complex ones found in actual implementations. 
For instance, mutexes that generate a run-time error or return an error code when locked 
twice by the same thread can be implemented using an extra program variable for each 
mutex / thread pair that stores whether the thread has already locked that mutex. Likewise, 
recursive mutexes can be implemented by making these variables count the number of times 
each thread has locked each mutex. Finally, locking with a timeout can be modeled as a 
non-deterministic conditional that either locks the mutex, or yields and returns an error 
code. 

Our scheduling model is that of real-time processes, used notably in embedded
systems. Example operating systems using this model include those obeying the ARINC 653
standard [3] (used in avionics), as well as the real-time extension of the POSIX threads
standard [34]. Hard guarantees about the execution time of services, although an
important feature of real-time systems, are not the purpose of this article as we abstract physical
time away. We are interested in another feature: the strict interpretation of thread
priorities when deciding which thread to schedule. More precisely: a thread that is not blocked
waiting for some resource can never be preempted by a lower priority thread. This is unlike
schedulers found in desktop computers (for instance, vanilla POSIX threads [34] without
the real-time extension) where even lower priority threads always get to run, preempting
higher priority ones if necessary. Moreover, we consider in this section that only a single
thread can execute at a given time (this was not required in Sec. 3). This is the case, for
instance, when all the threads share a single processor. In this model, the unblocked thread
with the highest priority is always the only one to run. All threads start unblocked and
may block, either by locking a mutex that is already locked by another thread, or by
yielding voluntarily, which allows lower priority threads to run. Yielding denotes blocking for a
non-deterministic amount of time, which is useful to model timers of arbitrary duration and
waiting for external resources. A lower priority thread can be preempted when unlocking
a mutex if a higher priority thread is waiting for this mutex. It can also be preempted at
any point by a yielding higher priority thread that wakes up non-deterministically. Thus, a
preempted thread can be made to wait at an arbitrary program point, and not necessarily at
a synchronization statement. The scheduling is dynamic and the number of possible thread
interleavings authorized by the scheduler remains very large, despite being controlled by
strict priorities.

    low priority                    high priority
    lock(m);                        X ← islocked(m);
    Z ← 1;                          if X = 0 then
    T ← Y - Z;                          Y ← 2;
    unlock(m)                       yield

Figure 14: Using priorities to ensure mutual exclusion.

This scheduling model is precise enough to take into account fine mutual exclusion
properties that would not hold if we considered arbitrary preemption or true parallel
executions on concurrent processors. For instance, in Fig. 14, the high priority thread avoids a
call to lock by testing with islocked whether the low priority thread acquired the lock and,
if not, executes its critical section and modifies Y and Z, confident that the low priority
thread cannot execute and enter its critical section before the high priority thread explicitly
yields.

4.2. Concrete Scheduled Interleaving Semantics $\mathbb{P}_{\mathcal{H}}$. We now refine the various
semantics of Sec. 3 to take scheduling into account, starting with the concrete interleaving
semantics $\mathbb{P}_*$ of Sec. 3.1. In this case, it is sufficient to redefine the semantics of primitive
statements. This new semantics will, in particular, exclude interleavings that do not respect
mutual exclusion or priorities, and thus, we observe fewer behaviors. This is materialized
by the dotted $\subseteq$ arrow in Fig. 1 between $\mathbb{P}_*$ and the refined semantics $\mathbb{P}_{\mathcal{H}}$ we are about to
present.

Note that Fig. 1 states that each concrete semantics without scheduler abstracts the corresponding
concrete semantics with scheduler, but states nothing about abstract semantics. Abstract semantics are
generally incomparable due to the use of non-monotonic abstractions and widenings.

We define a domain of scheduler states $\mathcal{H}$ as follows:

$$ \mathcal{H} \overset{\text{def}}{=} (\mathcal{T} \to \{\, \mathit{ready}, \mathit{yield}, \mathit{wait}(m) \mid m \in \mathcal{M} \,\}) \times (\mathcal{T} \to \mathcal{P}(\mathcal{M})) \tag{4.2} $$

A scheduler state $(b, l) \in \mathcal{H}$ is a pair, where the function $b$ associates to each thread whether
it is ready (i.e., it is not blocked, and runs if no higher priority thread is also ready), yielding
(i.e., it is blocked at a yield statement), or waiting for some mutex $m$ (i.e., it is blocked at
a lock(m) statement). The function $l$ associates to each thread the set of all the mutexes
it holds. A program state is now a pair $((b, l), \rho)$ composed of a scheduler state $(b, l) \in \mathcal{H}$
and an environment $\rho \in \mathcal{E}$. The semantic domain $\mathcal{D} = \mathcal{P}(\mathcal{E}) \times \mathcal{P}(\mathcal{L})$ from (2.1) is thus
replaced with $\mathcal{D}_{\mathcal{H}}$ defined as:

$$ \mathcal{D}_{\mathcal{H}} \overset{\text{def}}{=} \mathcal{P}(\mathcal{H} \times \mathcal{E}) \times \mathcal{P}(\mathcal{L}) \tag{4.3} $$

with the associated pairwise join $\sqcup_{\mathcal{H}}$.

The semantics $\mathbb{S}_{\mathcal{H}}[\![s,t]\!]$ of a primitive statement $s$ executed by a thread $t \in \mathcal{T}$ is
described in Fig. 15. It is decomposed into three steps: $\mathit{enabled}_t$, $\mathbb{S}^{\dagger}_{\mathcal{H}}[\![s,t]\!]$, and $\mathit{sched}$,
the first and the last steps being independent from the choice of statement $s$. Firstly, the
function $\mathit{enabled}_t$ filters states to keep only those where the thread $t$ can actually run, i.e.,
$t$ is the highest priority thread which is ready. Secondly, the function $\mathbb{S}^{\dagger}_{\mathcal{H}}[\![s,t]\!]$ handles
the statement-specific semantics. For yield, lock, and unlock statements, this consists
in updating the scheduler part of each state. For lock statements, the thread enters a
wait state until the mutex is available. Actually acquiring the mutex is performed by the
following $\mathit{sched}$ step if the mutex is immediately available, and by a later $\mathit{sched}$ step following
the unlocking of the mutex by its owner thread otherwise. The islocked statement updates
each environment according to its paired scheduler state. The other primitive statements,
assignments and guards, are not related to scheduling; their semantics is defined by applying
the regular, mono-threaded semantics $\mathbb{S}[\![s]\!]$ from Fig. 4 to the environment part, leaving
the scheduler state unchanged. Thirdly, the function $\mathit{sched}$ updates the scheduler state by
waking up yielding threads non-deterministically, and giving any newly available mutex to
the highest priority thread waiting for it, if any.
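The three steps can be made concrete on explicit, finite sets of states. The sketch below is only an illustration under assumptions that are not in the paper: threads are numbered 0..n-1 with a larger number meaning a higher priority, a scheduler state is a pair of tuples `(b, l)` indexed by thread, the environment `rho` is an opaque hashable value, and all function names are hypothetical.

```python
from itertools import product

READY, YIELD = "ready", "yield"

def wait(m):
    """Status of a thread blocked on mutex m."""
    return ("wait", m)

def replace_at(tup, i, value):
    return tup[:i] + (value,) + tup[i + 1:]

def enabled(t, states):
    """Keep the states where t is ready and no higher priority thread is ready."""
    return {(b, l, rho) for (b, l, rho) in states
            if b[t] == READY and all(s != READY for s in b[t + 1:])}

def step_yield(t, states):
    return {(replace_at(b, t, YIELD), l, rho) for (b, l, rho) in states}

def step_lock(t, m, states):
    # The thread only enters a wait state; sched() grants the mutex.
    return {(replace_at(b, t, wait(m)), l, rho) for (b, l, rho) in states}

def step_unlock(t, m, states):
    return {(b, replace_at(l, t, l[t] - {m}), rho) for (b, l, rho) in states}

def sched(states):
    """Wake yielding threads non-deterministically and grant free mutexes."""
    result = set()
    for (b, l, rho) in states:
        per_thread = []
        for t, status in enumerate(b):
            if isinstance(status, tuple):          # wait(m)
                m = status[1]
                free = all(m not in held for held in l)
                first = all(other != status for other in b[t + 1:])
                if m in l[t] or (free and first):
                    per_thread.append([(READY, l[t] | {m})])
                else:
                    per_thread.append([(status, l[t])])
            elif status == YIELD:                  # may stay blocked or wake up
                per_thread.append([(YIELD, l[t]), (READY, l[t])])
            else:
                per_thread.append([(READY, l[t])])
        for combo in product(*per_thread):
            result.add((tuple(s for s, _ in combo),
                        tuple(h for _, h in combo), rho))
    return result
```

A statement of thread `t` is then executed as `sched(step(enabled(t, states)))`, mirroring the composition of the three steps.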

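The semantics $\boldsymbol{\Pi}_{\mathcal{H}}[\![P]\!] \in \mathcal{D}_{\mathcal{H}} \to \mathcal{D}_{\mathcal{H}}$ of a set $P \subseteq \Pi_*$ of parallel control paths then
becomes, similarly to (3.1):

$$ \boldsymbol{\Pi}_{\mathcal{H}}[\![P]\!](R,\Omega) \overset{\text{def}}{=} \bigsqcup\nolimits_{\mathcal{H}} \{\, (\mathbb{S}_{\mathcal{H}}[\![s_n,t_n]\!] \circ \cdots \circ \mathbb{S}_{\mathcal{H}}[\![s_1,t_1]\!])(R,\Omega) \mid (s_1,t_1) \cdot \ldots \cdot (s_n,t_n) \in P \,\} \tag{4.4} $$

and the semantics $\mathbb{P}_{\mathcal{H}}$ of the program is, similarly to (3.3):

$$ \mathbb{P}_{\mathcal{H}} \overset{\text{def}}{=} \Omega, \text{ where } (-,\Omega) = \boldsymbol{\Pi}_{\mathcal{H}}[\![\pi_*]\!](\{h_0\} \times \mathcal{E}_0, \emptyset) \tag{4.5} $$

where $\pi_*$ is the set of parallel control paths of the program, defined in (3.2), and
$h_0 = (\lambda t : \mathit{ready}, \lambda t : \emptyset)$ denotes the initial scheduler state (all the threads are ready and hold no
mutex). As in Sec. 3.1, many parallel control paths in $\pi_*$ are unfeasible, i.e., return an empty
set of environments, some of which are now ruled out by the $\mathit{enabled}_t$ function because they
do not obey the real-time scheduling policy or do not ensure the mutual exclusion enforced
by locks. Nevertheless, errors from a feasible prefix of an unfeasible path are still taken into
account. This includes, in particular, the errors that occur before a deadlock.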

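4.3. Scheduled Weakly Consistent Memory Semantics $\mathbb{P}'_{\mathcal{H}}$. As was the case for the
interleaving semantics without a scheduler (Sec. 3.1), the scheduled interleaving semantics
does not take into account the effect of a weakly consistent memory. Recall that a lack of
memory consistency can be caused by the underlying hardware memory model of a
multiprocessor, by compiler optimisations, or by non-atomic primitive statements. While we can
disregard the hardware issues when considering mono-processor systems (i.e., everywhere in
Sec. 4 except Sec. 4.4.5), the other issues remain, and so, we must consider their interaction
with the scheduler. Thus, we now briefly present a weakly consistent memory semantics for
programs with a scheduler. The interference semantics designed in Secs. 4.4 and 4.5 will be
sound with respect to this semantics.

$$
\begin{aligned}
& \mathbb{S}_{\mathcal{H}}[\![s,t]\!] \overset{\text{def}}{=} \mathit{sched} \circ \mathbb{S}^{\dagger}_{\mathcal{H}}[\![s,t]\!] \circ \mathit{enabled}_t \\
& \text{where:} \\
& \mathit{enabled}_t(R,\Omega) \overset{\text{def}}{=} (\{\, ((b,l),\rho) \in R \mid b(t) = \mathit{ready} \wedge \forall t' > t : b(t') \neq \mathit{ready} \,\},\ \Omega) \\
& \mathbb{S}^{\dagger}_{\mathcal{H}}[\![\mathbf{yield},t]\!](R,\Omega) \overset{\text{def}}{=} (\{\, ((b[t \mapsto \mathit{yield}],l),\rho) \mid ((b,l),\rho) \in R \,\},\ \Omega) \\
& \mathbb{S}^{\dagger}_{\mathcal{H}}[\![\mathbf{lock}(m),t]\!](R,\Omega) \overset{\text{def}}{=} (\{\, ((b[t \mapsto \mathit{wait}(m)],l),\rho) \mid ((b,l),\rho) \in R \,\},\ \Omega) \\
& \mathbb{S}^{\dagger}_{\mathcal{H}}[\![\mathbf{unlock}(m),t]\!](R,\Omega) \overset{\text{def}}{=} (\{\, ((b,l[t \mapsto l(t) \setminus \{m\}]),\rho) \mid ((b,l),\rho) \in R \,\},\ \Omega) \\
& \mathbb{S}^{\dagger}_{\mathcal{H}}[\![X \leftarrow \mathbf{islocked}(m),t]\!](R,\Omega) \overset{\text{def}}{=} \\
& \qquad (\{\, ((b,l),\rho[X \mapsto 0]) \mid ((b,l),\rho) \in R,\ \forall t' \in \mathcal{T} : m \notin l(t') \,\} \ \cup \\
& \qquad \ \{\, ((b,l),\rho[X \mapsto 1]) \mid ((b,l),\rho) \in R,\ \exists t' \in \mathcal{T} : m \in l(t') \,\},\ \Omega) \\
& \text{for all other primitive statements } s \in \{\, X \leftarrow e,\ e \bowtie 0? \,\}: \\
& \mathbb{S}^{\dagger}_{\mathcal{H}}[\![s,t]\!](R,\Omega) \overset{\text{def}}{=} (\{\, ((b,l),\rho') \mid \exists \rho : ((b,l),\rho) \in R,\ (R',-) = \mathbb{S}[\![s]\!](\{\rho\},\Omega),\ \rho' \in R' \,\},\ \Omega') \\
& \qquad \text{where } (-,\Omega') = \mathbb{S}[\![s]\!](\{\, \rho \mid (-,\rho) \in R \,\},\ \Omega) \\
& \mathit{sched}(R,\Omega) \overset{\text{def}}{=} (\{\, ((b',l'),\rho) \mid ((b,l),\rho) \in R \,\},\ \Omega) \quad \text{s.t. } \forall t : \\
& \qquad \text{if } b(t) = \mathit{wait}(m) \wedge (m \in l(t) \vee (\forall t' : m \notin l(t') \wedge \forall t' > t : b(t') \neq \mathit{wait}(m))) \\
& \qquad \text{then } b'(t) = \mathit{ready} \wedge l'(t) = l(t) \cup \{m\} \\
& \qquad \text{else } l'(t) = l(t) \wedge (b'(t) = b(t) \vee (b'(t) = \mathit{ready} \wedge b(t) = \mathit{yield}))
\end{aligned}
$$

Figure 15: Concrete semantics of primitive statements with a scheduler.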

In addition to restricting the interleaving of threads, synchronization primitives also 
have an effect when considering weakly consistent memory semantics: they enforce some 
form of sequential consistency at a coarse granularity level. More precisely, the compiler 
and processor handle synchronization statements specially, introducing the necessary flushes 
into memory and register reloads, and refraining from optimizing across them. 

Recall that the weakly consistent semantics $\mathbb{P}'_*$ of Sec. 3.4 is based on the interleaving
semantics $\mathbb{P}_*$ of Sec. 3.1 applied to transformed threads $\pi'(t)$, which are obtained by
transforming the paths in $\pi(\mathit{body}_t)$ using elementary path transformations $q \rightsquigarrow q'$ from
Def. 3.4. To take synchronization into account, we use the same definition of transformed
threads $\pi'(t)$, but restrict it to transformations $q \rightsquigarrow q'$ that do not contain any
synchronization primitive. For instance, we forbid the application of sub-expression elimination
(Def. 3.4 (8)) on the following path: lock(m) · y ← e $\rightsquigarrow$ X ← e · lock(m) · y ← X.
However, if $q$ and $q'$ do not contain any synchronization primitive, and $q \rightsquigarrow q'$, then it is
legal to replace $q$ with $q'$ in a path containing synchronization primitives before and after
$q$. For instance, the transformation lock(m) · y ← e $\rightsquigarrow$ lock(m) · X ← e · y ← X is
acceptable. The scheduled weakly consistent memory semantics is then, based on (4.5):

$$ \mathbb{P}'_{\mathcal{H}} \overset{\text{def}}{=} \Omega, \text{ where } (-,\Omega) = \boldsymbol{\Pi}_{\mathcal{H}}[\![\pi'_*]\!](\{h_0\} \times \mathcal{E}_0, \emptyset) \tag{4.6} $$

where $\pi'_*$ is defined, as before (3.7), as the interleavings of control paths from all $\pi'(t)$, $t \in \mathcal{T}$.

[Figure 16: Well synchronized versus weakly consistent communications. Each panel shows the
writes (W) and reads (R) of thread 1 and thread 2 relative to their lock(m) and unlock(m) points.
(a) Well synchronized communication. (b) Weakly consistent communications.]

4.4. Concrete Scheduled Interference Semantics $\mathbb{P}_{\mathcal{C}}$. We now provide a structured
version $\mathbb{P}_{\mathcal{C}}$ of the scheduled interleaving semantics $\mathbb{P}_{\mathcal{H}}$. Similarly to the interference
abstraction $\mathbb{P}_{\mathcal{I}}$ from Sec. 3.2 of the non-scheduled interleaving semantics $\mathbb{P}_*$, it is based on a
notion of interference, it is sound with respect to both the interleaving semantics $\mathbb{P}_{\mathcal{H}}$ and
its weakly consistent version $\mathbb{P}'_{\mathcal{H}}$, but it is not complete with respect to either of them. The
main changes with respect to the interference abstraction $\mathbb{P}_{\mathcal{I}}$ are: a notion of scheduler
configuration (recording some information about the state of mutexes), a partitioning of
interferences and environments with respect to configurations, and a distinction between well
synchronized thread communications and data-races. As our semantics is rather complex,
we first present it graphically on examples before describing it in formal terms.



4.4.1. Interferences. In the non-scheduled semantics $\mathbb{P}_{\mathcal{I}}$ (Sec. 3.2), any interference $(t, X, v)$,
i.e., any write by a thread $t$ of a value $v$ into a variable $X$, could influence any read from
the same variable $X$ in another thread $t' \neq t$. While this is also a sound abstraction of
the semantics with a scheduler, the precision can be improved by refining our notion of
interference and exploiting mutual exclusion properties enforced by the scheduler.

Good programming practice dictates that all read and write accesses to a given shared
variable should be protected by a common mutex. This is exemplified in Fig. 16 (a) where
W and R denote respectively a write to and a read from a variable X, and all reads and
writes are protected by a mutex m. In this example, thread 1 writes twice to X while
holding m. Thus, when thread 2 locks m and reads X, it can see the second value written
by thread 1, but never the first one, which is necessarily overwritten before thread 2 acquires
m. Likewise, after thread 2 locks m and overwrites X, while it still holds m it can only
read back the value it has written and not any value written by thread 1. Thus, a single
interference from thread 1 can affect thread 2, and at only one read position; we call this
read / write pair a "well synchronized communication." Well synchronized communications
are flow-sensitive (the order of writes and reads matters), and so, differ significantly from
the interferences of Sec. 3.2. In practice, we model such communications by recording at
the point unlock(m) in thread 1 the current value of all the variables that are modified
while m is locked, and importing these values in the environments at the point lock(m) in
thread 2.
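The record-at-unlock, import-at-lock idea can be illustrated on the scenario of Fig. 16 (a). This is only a sketch under our own simplifications: a single mutex, environments as dictionaries, and hypothetical helper names; the paper's actual functions are the `in` and `out` functions defined later.

```python
def run_critical_section(env, writes):
    """Execute the writes of a critical section and return the final
    environment together with the values published at unlock(m):
    the final value of every variable modified while m was held."""
    env = dict(env)
    modified = set()
    for var, value in writes:
        env[var] = value
        modified.add(var)
    published = {(var, env[var]) for var in modified}
    return env, published

def values_visible_after_lock(var, own_value, published):
    """Values another thread may read for `var` right after lock(m):
    its own value, or a value published at an unlock(m)."""
    return {own_value} | {value for (v, value) in published if v == var}
```

If thread 1 writes 1 and then 2 into X while holding m, only the pair `("X", 2)` is published, so thread 2 may read its own value or 2 after locking m, but never the intermediate value 1.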

Accesses are not always protected by mutexes, though. Consider, for instance, the
example in Fig. 16 (b), where X may additionally be modified by thread 1 and read by
thread 2 outside the critical sections defined by mutex m. In addition to the well
synchronized communication of Fig. 16 (a), which is omitted for clarity in Fig. 16 (b), we consider
that a write from thread 1 affects a read from thread 2 if either operation is performed
while m is not locked. These read / write pairs correspond to data-races, and neither the
compiler nor the hardware is required to enforce memory consistency. We call these pairs
"weakly consistent communications." In practice, these are handled in a way similar to
the interferences in Sec. 3.2: the values thread 1 can write into X are remembered in a
flow-insensitive interference set, and the semantics of expressions is modified so that, when
reading X in thread 2, either the thread's value for X or a value from the interference set
is used. We also remember the set of mutexes that threads hold during each read and each
write, so that we can discard communications that cannot occur due to mutual exclusion.
For instance, in Fig. 16 (b), there is no communication of any kind between the first write
in thread 1 and the second read in thread 2. The example also shows that well
synchronized and weakly consistent communications can mix freely: there is no weakly consistent
communication between the second write in thread 1 and the second read in thread 2 due
to mutual exclusion (both threads hold the mutex m); however, there is a well synchronized
communication, shown in Fig. 16 (a).



Figure 17 illustrates the communications in the case of several mutexes: m1 and m2.
In Fig. 17 (a), weakly consistent communications only occur between write / read pairs
when the involved threads have not locked a common mutex. For instance, the first write
by thread 1 is tagged with the set of locked mutexes {m1}, and so, can only influence the
first read by thread 2 (tagged with ∅) and not the following two (tagged respectively with
{m1} and {m1, m2}). Likewise, the second write, tagged with {m1, m2}, only influences
the first read. However, the third write, tagged with only {m2}, influences the first two
reads (thread 2 does not hold the mutex m2 there). In Fig. 17 (b), well synchronized
communications import, as before, at a lock of mutex m1 (resp. m2) in thread 2, the
last value written by thread 1 before unlocking the same mutex m1 (resp. m2). The well
synchronized communication in Fig. 17 (c) is more interesting. In that case, thread 1 unlocks
m2 before m1, instead of after. As expected, when thread 2 locks m1, it imports the last
(third) value written by thread 1, just before unlocking m1. We note, however, that the
second write in thread 1 does not influence thread 2 while thread 2 holds mutex m1, as the
value is always overwritten by thread 1 before unlocking m1. We model this by importing,
when locking a mutex m in thread 2, only the values written by thread 1 while it does not
hold a common mutex (in addition to m) with thread 2. Thus, when locking m2 while
still holding the mutex m1, thread 2 does not import the second value written by thread 1
because thread 1 also holds m1 during this write.

[Figure 17: Well synchronized and weakly consistent communications with two locks.
(a) Weakly consistent communications. (b) Well synchronized communications.
(c) Well synchronized communications.]
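The tagging rule for weakly consistent communications in Fig. 17 (a) can be checked mechanically. The sketch below encodes the three writes of thread 1 and the three reads of thread 2 by their sets of held mutexes, as given in the text; the function name and the index-based encoding are our own.

```python
def may_communicate(write_locks, read_locks):
    """A write can influence a read (weakly consistent communication)
    only if the two threads hold no common mutex at these points."""
    return not (set(write_locks) & set(read_locks))

# Tags from the discussion of Fig. 17 (a), in program order.
writes = [{"m1"}, {"m1", "m2"}, {"m2"}]   # thread 1
reads = [set(), {"m1"}, {"m1", "m2"}]     # thread 2

# Pairs (write index, read index) that may communicate.
influence = {(w, r)
             for w, w_locks in enumerate(writes)
             for r, r_locks in enumerate(reads)
             if may_communicate(w_locks, r_locks)}
```

The resulting set contains the pairs (0, 0), (1, 0), (2, 0) and (2, 1): the first two writes reach only the first read, and the third write reaches the first two reads.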

4.4.2. Interference partitioning. To differentiate between well synchronized and weakly
consistent communications, and to avoid considering communications between parts of threads
that are in mutual exclusion, we partition interferences with respect to a thread-local view
of scheduler configurations. The (finite) set $\mathcal{C}$ of configurations is defined as:

$$ \mathcal{C} \overset{\text{def}}{=} \mathcal{P}(\mathcal{M}) \times \mathcal{P}(\mathcal{M}) \times \{\, \mathit{weak}, \mathit{sync}(m) \mid m \in \mathcal{M} \,\} \tag{4.7} $$

In a configuration $(l,u,s) \in \mathcal{C}$, the first component $l \subseteq \mathcal{M}$ denotes the exact set of mutexes
locked by the thread creating the interference, which is useful to decide which reads will
be affected by the interference. The second component $u \subseteq \mathcal{M}$ denotes a set of mutexes
that are known to be locked by no thread (either current or not). This information is
inferred by the semantics of islocked statements and can be exploited to detect extra
mutual exclusion properties that further limit the set of reads affected by an interference
(as in the example in Fig. 14). The last component, $s$, allows distinguishing between
weakly consistent and well synchronized communications: $\mathit{weak}$ denotes an interference that
generates weakly consistent communications, while $\mathit{sync}(m)$ denotes an interference that
generates well synchronized communications for critical sections protected by the mutex $m$.
These two kinds of interferences are naturally called, respectively, weakly consistent and
well synchronized interferences. The partitioned domain of interferences is then:

$$ \mathcal{I} \overset{\text{def}}{=} \mathcal{T} \times \mathcal{C} \times \mathcal{V} \times \mathbb{R} \tag{4.8} $$

which enriches the definition of $\mathcal{I}$ from Sec. 3.2 with a scheduler configuration in $\mathcal{C}$. The
interference $(t, c, X, v) \in \mathcal{I}$ indicates that the thread $t \in \mathcal{T}$ can write the value $v \in \mathbb{R}$ into
the variable $X \in \mathcal{V}$ and the scheduler is in the configuration $c \in \mathcal{C}$ at the time of the write.
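To get a feel for the size of the (finite) set of configurations in (4.7), one can enumerate it for a small set of mutexes. The encoding below (frozensets for $l$ and $u$, the string `"weak"` or a pair `("sync", m)` for $s$) is an assumption made for this illustration only.

```python
from itertools import combinations

def powerset(items):
    """All subsets of `items`, as frozensets."""
    items = sorted(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def configurations(mutexes):
    """Enumerate C = P(M) x P(M) x ({weak} U {sync(m) | m in M})."""
    kinds = ["weak"] + [("sync", m) for m in sorted(mutexes)]
    return {(l, u, s)
            for l in powerset(mutexes)
            for u in powerset(mutexes)
            for s in kinds}
```

For a set of $n$ mutexes this yields $2^n \cdot 2^n \cdot (n+1)$ configurations, for instance 48 for two mutexes.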

4.4.3. Environment partitioning. When computing program states in our semantics,
environments are also partitioned with respect to scheduler configurations in order to track
some information on the current state of mutexes. Thus, our program states associate an
environment $\rho \in \mathcal{E}$ and a configuration $(l,u,s) \in \mathcal{C}$, where the configuration $(l,u,s)$
indicates the set of mutexes $l$ held by the thread in that state, as well as the set of mutexes
$u$ that are known to be held by no thread; the $s$ component is not used and always set by
convention to $\mathit{weak}$. The semantic domain is now:

$$ \mathcal{D}_{\mathcal{C}} \overset{\text{def}}{=} \mathcal{P}(\mathcal{C} \times \mathcal{E}) \times \mathcal{P}(\mathcal{L}) \times \mathcal{P}(\mathcal{I}) \tag{4.9} $$

partially ordered by pointwise set inclusion. We denote by $\sqcup_{\mathcal{C}}$ the associated pointwise join.
While regular statements (such as assignments and tests) update the environment part of
each state, synchronization primitives update the scheduler part of the state.

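The use of pairs of environments and scheduler configurations allows representing
relationships between the value of a variable and the state of a mutex, which is important for the
precise modeling of the islocked primitive in code similar to that of Fig. 14. For instance,
after the statement X ← islocked(m), all states $((l,u,s), \rho)$ satisfy $\rho(X) = 0 \implies m \in u$.
Thus, when the high thread enters the "then" branch of the subsequent X = 0 test, we
know that m is not locked by any thread and we can disregard the interferences generated
by the low thread while holding m.

$$
\begin{aligned}
& \mathbb{E}_{\mathcal{C}}[\![e]\!] : (\mathcal{T} \times \mathcal{C} \times \mathcal{E} \times \mathcal{P}(\mathcal{I})) \to (\mathcal{P}(\mathbb{R}) \times \mathcal{P}(\mathcal{L})) \\
& \mathbb{E}_{\mathcal{C}}[\![X]\!](t,c,\rho,I) \overset{\text{def}}{=} (\{\rho(X)\} \cup \{\, v \mid \exists (t',c',X,v) \in I : t \neq t' \wedge \mathit{intf}(c,c') \,\},\ \emptyset) \\
& \mathbb{E}_{\mathcal{C}}[\![[c_1,c_2]]\!](t,c,\rho,I) \overset{\text{def}}{=} (\{\, x \in \mathbb{R} \mid c_1 \leq x \leq c_2 \,\},\ \emptyset) \\
& \mathbb{E}_{\mathcal{C}}[\![-_{\ell}\, e]\!](t,c,\rho,I) \overset{\text{def}}{=} \text{let } (V,\Omega) = \mathbb{E}_{\mathcal{C}}[\![e]\!](t,c,\rho,I) \text{ in } (\{\, -x \mid x \in V \,\},\ \Omega) \\
& \mathbb{E}_{\mathcal{C}}[\![e_1 \circ_{\ell} e_2]\!](t,c,\rho,I) \overset{\text{def}}{=} \\
& \qquad \text{let } (V_1,\Omega_1) = \mathbb{E}_{\mathcal{C}}[\![e_1]\!](t,c,\rho,I) \text{ in} \\
& \qquad \text{let } (V_2,\Omega_2) = \mathbb{E}_{\mathcal{C}}[\![e_2]\!](t,c,\rho,I) \text{ in} \\
& \qquad (\{\, x_1 \circ x_2 \mid x_1 \in V_1,\ x_2 \in V_2,\ \circ \neq / \vee x_2 \neq 0 \,\},\ \Omega_1 \cup \Omega_2 \cup \{\, \ell \mid \circ = / \wedge 0 \in V_2 \,\}) \\
& \qquad \text{where } \circ \in \{+,-,\times,/\} \\
& \text{where:} \\
& \mathit{intf}((l,u,s),(l',u',s')) \overset{\text{def}}{\iff} l \cap l' = u \cap l' = u' \cap l = \emptyset \wedge s = s' = \mathit{weak}
\end{aligned}
$$

Figure 18: Concrete scheduled semantics of expressions with interference.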

4.4.4. Semantics. We now describe in detail the semantics of expressions and statements.
It is presented fully formally in Figs. 18 and 19.

The semantics $\mathbb{E}_{\mathcal{C}}[\![e]\!](t, c, \rho, I)$ of an expression $e$ is presented in Fig. 18. It is similar to
the non-scheduled semantics $\mathbb{E}_{\mathcal{I}}[\![e]\!](t, \rho, I)$ of Fig. 7, except that it has an extra argument:
the current configuration $c \in \mathcal{C}$ (4.7) of the thread evaluating the expression. The other
arguments are: the thread $t \in \mathcal{T}$ evaluating the expression, the environment $\rho \in \mathcal{E}$ in which
it is evaluated, and a set $I \subseteq \mathcal{I}$ (4.8) of interferences from the whole program.
Interferences are applied when reading a variable, in $\mathbb{E}_{\mathcal{C}}[\![X]\!]$. Only weakly consistent interferences are
handled in expressions; well synchronized interferences are handled in the semantics of
synchronization primitives, presented below. Moreover, we consider only interferences with
configurations that are not in mutual exclusion with the current configuration $c$. Mutual
exclusion is enforced by the predicate $\mathit{intf}$, which states that, in two scheduler configurations
$(l, u, \mathit{weak})$ and $(l', u', \mathit{weak})$ for distinct threads, no mutex can be locked by both threads
($l \cap l' = \emptyset$), and no thread can lock a mutex which is assumed to be free by the other one
($l \cap u' = l' \cap u = \emptyset$). The semantics of other expression constructs remains the same, passing
recursively the arguments $t$, $c$, and $I$ unused and unchanged.
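The predicate $\mathit{intf}$ and the variable-read case of the expression semantics can be sketched directly on finite sets. In this illustration, configurations are triples `(l, u, s)` of two frozensets and a tag, interferences are tuples `(t, c, X, v)`, and the function name `read_variable` is hypothetical; error sets are omitted since reading a variable produces none.

```python
def intf(c1, c2):
    """True when two configurations are not in mutual exclusion:
    l ∩ l' = u ∩ l' = u' ∩ l = ∅ and both interferences are weak."""
    (l1, u1, s1), (l2, u2, s2) = c1, c2
    return (not (l1 & l2) and not (u1 & l2) and not (u2 & l1)
            and s1 == "weak" and s2 == "weak")

def read_variable(x, t, c, rho, interferences):
    """Possible values when thread t reads x in configuration c: its own
    value, or any value written by another thread in a compatible
    configuration."""
    values = {rho[x]}
    for (t2, c2, y, v) in interferences:
        if y == x and t2 != t and intf(c, c2):
            values.add(v)
    return values
```

For instance, with a single interference written by a low thread while holding m, a high thread sees the written value when it knows nothing about m, but not when m belongs to its $u$ component (m is known to be free) or to its $l$ component (it holds m itself).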

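We now turn to the semantics $\mathbb{S}_{\mathcal{C}}[\![s, t]\!](R, \Omega, I)$ of a statement $s$ executed by a thread $t$,
which is defined in Fig. 19. It takes as first argument a set $R$ of states which are now pairs
consisting of an environment $\rho \in \mathcal{E}$ and a scheduler configuration $c \in \mathcal{C}$, i.e., $R \subseteq \mathcal{C} \times \mathcal{E}$.
The other arguments are, as in the non-scheduled semantics of Fig. 8, a set of run-time
errors $\Omega \subseteq \mathcal{L}$ to enrich, and a set of interferences $I \subseteq \mathcal{I}$ to use and enrich. The semantics of
assignments and tests in Fig. 19 is similar to the non-scheduled case (Fig. 8). The scheduler
configuration associated with each input environment is simply passed as argument to the
expression semantics $\mathbb{E}_{\mathcal{C}}$ in order to select precisely which weakly consistent interferences
to apply (through $\mathit{intf}$), but it is otherwise left unmodified in the output. Additionally,
assignments X ← e generate weakly consistent interferences, which store in $I$ the current
thread $t$ and the scheduler configuration $c$ of its state, in addition to the modified variable
X and its new value.

$$
\begin{aligned}
& \mathbb{S}_{\mathcal{C}}[\![s,t]\!] : \mathcal{D}_{\mathcal{C}} \to \mathcal{D}_{\mathcal{C}} \\
& \mathbb{S}_{\mathcal{C}}[\![X \leftarrow e, t]\!](R,\Omega,I) \overset{\text{def}}{=} \\
& \qquad (\emptyset,\Omega,I) \sqcup_{\mathcal{C}} \bigsqcup\nolimits_{\mathcal{C}\,(c,\rho) \in R} \text{let } (V,\Omega') = \mathbb{E}_{\mathcal{C}}[\![e]\!](t,c,\rho,I) \text{ in } (\{\, (c,\rho[X \mapsto v]) \mid v \in V \,\},\ \Omega',\ \{\, (t,c,X,v) \mid v \in V \,\}) \\
& \mathbb{S}_{\mathcal{C}}[\![e \bowtie 0?, t]\!](R,\Omega,I) \overset{\text{def}}{=} \\
& \qquad (\emptyset,\Omega,I) \sqcup_{\mathcal{C}} \bigsqcup\nolimits_{\mathcal{C}\,(c,\rho) \in R} \text{let } (V,\Omega') = \mathbb{E}_{\mathcal{C}}[\![e]\!](t,c,\rho,I) \text{ in } (\{\, (c,\rho) \mid \exists v \in V : v \bowtie 0 \,\},\ \Omega',\ \emptyset) \\
& \mathbb{S}_{\mathcal{C}}[\![\textbf{if } e \bowtie 0 \textbf{ then } s, t]\!](R,\Omega,I) \overset{\text{def}}{=} \\
& \qquad (\mathbb{S}_{\mathcal{C}}[\![s,t]\!] \circ \mathbb{S}_{\mathcal{C}}[\![e \bowtie 0?, t]\!])(R,\Omega,I) \sqcup_{\mathcal{C}} \mathbb{S}_{\mathcal{C}}[\![e \not\bowtie 0?, t]\!](R,\Omega,I) \\
& \mathbb{S}_{\mathcal{C}}[\![\textbf{while } e \bowtie 0 \textbf{ do } s, t]\!](R,\Omega,I) \overset{\text{def}}{=} \\
& \qquad \mathbb{S}_{\mathcal{C}}[\![e \not\bowtie 0?, t]\!](\mathrm{lfp}\ \lambda X : (R,\Omega,I) \sqcup_{\mathcal{C}} (\mathbb{S}_{\mathcal{C}}[\![s,t]\!] \circ \mathbb{S}_{\mathcal{C}}[\![e \bowtie 0?, t]\!])X) \\
& \mathbb{S}_{\mathcal{C}}[\![s_1; s_2, t]\!](R,\Omega,I) \overset{\text{def}}{=} (\mathbb{S}_{\mathcal{C}}[\![s_2,t]\!] \circ \mathbb{S}_{\mathcal{C}}[\![s_1,t]\!])(R,\Omega,I) \\
& \mathbb{S}_{\mathcal{C}}[\![\mathbf{lock}(m), t]\!](R,\Omega,I) \overset{\text{def}}{=} \\
& \qquad (\{\, ((l \cup \{m\}, \emptyset, s), \rho') \mid ((l,-,s),\rho) \in R,\ \rho' \in \mathit{in}(t,l,\emptyset,m,\rho,I) \,\}, \\
& \qquad \ \Omega,\ I \cup \bigcup \{\, \mathit{out}(t,l,\emptyset,m',\rho,I) \mid \exists u : ((l,u,-),\rho) \in R \wedge m' \in u \,\}) \\
& \mathbb{S}_{\mathcal{C}}[\![\mathbf{unlock}(m), t]\!](R,\Omega,I) \overset{\text{def}}{=} \\
& \qquad (\{\, ((l \setminus \{m\}, u, s), \rho) \mid ((l,u,s),\rho) \in R \,\}, \\
& \qquad \ \Omega,\ I \cup \bigcup \{\, \mathit{out}(t, l \setminus \{m\}, u, m, \rho, I) \mid ((l,u,-),\rho) \in R \,\}) \\
& \mathbb{S}_{\mathcal{C}}[\![\mathbf{yield}, t]\!](R,\Omega,I) \overset{\text{def}}{=} \\
& \qquad (\{\, ((l,\emptyset,s),\rho) \mid ((l,-,s),\rho) \in R \,\}, \\
& \qquad \ \Omega,\ I \cup \bigcup \{\, \mathit{out}(t,l,\emptyset,m',\rho,I) \mid \exists u : ((l,u,-),\rho) \in R \wedge m' \in u \,\}) \\
& \mathbb{S}_{\mathcal{C}}[\![X \leftarrow \mathbf{islocked}(m), t]\!](R,\Omega,I) \overset{\text{def}}{=} \\
& \quad \text{if no thread } t' > t \text{ locks } m \text{, then:} \\
& \qquad (\{\, ((l, u \cup \{m\}, s), \rho'[X \mapsto 0]) \mid ((l,u,s),\rho) \in R,\ \rho' \in \mathit{in}(t,l,u,m,\rho,I) \,\} \ \cup \\
& \qquad \ \{\, ((l, u \setminus \{m\}, s), \rho[X \mapsto 1]) \mid ((l,u,s),\rho) \in R \,\}, \\
& \qquad \ \Omega,\ I \cup \{\, (t,c,X,v) \mid v \in \{0,1\},\ (c,-) \in R \,\}) \\
& \quad \text{otherwise:} \\
& \qquad \mathbb{S}_{\mathcal{C}}[\![X \leftarrow [0,1], t]\!](R,\Omega,I) \\
& \text{where:} \\
& \mathit{in}(t,l,u,m,\rho,I) \overset{\text{def}}{=} \\
& \qquad \{\, \rho' \mid \forall X \in \mathcal{V} : \rho'(X) = \rho(X) \vee (\exists t',l',u' : (t',(l',u',\mathit{sync}(m)),X,\rho'(X)) \in I \\
& \qquad\qquad \wedge\ t \neq t' \wedge l \cap l' = l \cap u' = l' \cap u = \emptyset) \,\} \\
& \mathit{out}(t,l,u,m,\rho,I) \overset{\text{def}}{=} \\
& \qquad \{\, (t,(l,u,\mathit{sync}(m)),X,\rho(X)) \mid \exists l' : (t,(l',-,\mathit{weak}),X,-) \in I \wedge m \in l' \,\}
\end{aligned}
$$

Figure 19: Concrete scheduled semantics of statements with interference.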

The semantics of non-primitive statements remains the same as in previous semantics
by structural induction on the syntax of statements (e.g., Fig. 8).

The main point of note is thus the semantics of synchronization primitives. It updates
the scheduler configuration and takes care of well synchronized interferences.

Let us explain first how the scheduler part $(l,u,s) \in \mathcal{C}$ of a state $((l,u,s),\rho) \in R$ is
updated. Firstly, the set $l$ of mutexes held by the current thread is updated by the primitives
lock(m) and unlock(m) by respectively adding m to and removing m from $l$. Secondly, the
set of mutexes $u$ that are known to be free in the system is updated by X ← islocked(m).
Generally, no information on the state of the mutex is known a priori. Each input state
thus spawns two output states: one where m is free ($m \in u$), and one where m is not
free ($m \notin u$). In the first state, X is set to 0 while, in the second state, it is set to 1.
As a consequence, although the primitive cannot actually infer whether the mutex is free
or not, it nevertheless keeps the relationship between the value of X and the fact that m
is free. Inferring this relation is sufficient to analyze precisely the code in Fig. 14. It is
important to note that the information $m \in u$ is transient as, when a context switch
occurs, another thread $t'$ can run and lock m, thus invalidating the assumption by thread $t$
that no thread has locked m. We distinguish two scenarios, depending on whether $t'$ has
higher priority than $t$ or not. When $t' < t$, the thread $t'$ has lower priority and cannot
preempt $t$ at an arbitrary point due to the real-time nature of the scheduler. Instead, $t'$
must wait until $t$ performs a blocking operation (i.e., calls a lock or yield primitive) to
get the opportunity to lock m. This case is handled by having all our blocking primitives
reset the $u$ component to ∅. When $t' > t$, the thread $t'$ can preempt $t$ at arbitrary points,
including just after the islocked primitive, and so, we can never safely assume that $m \in u$.
If this scenario is possible, X ← islocked(m) is modeled as X ← [0, 1], without updating
$u$. To decide which transfer function to use for islocked, we need to know the set of all
mutexes that can be locked by each thread. It is quite easy to enrich our semantics to
track this information but, as it is cumbersome, we did not include this in Fig. 19. One
way is to add a new component $M : \mathcal{T} \to \mathcal{P}(\mathcal{M})$ to the domain $\mathcal{I}$ of interferences, in
which we remember the set of arguments m of each lock(m) encountered by each thread;
then, we check that $\nexists t' > t : m \in M(t')$ before applying the precise transfer function for
X ← islocked(m) in thread $t$.
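The state-splitting behaviour of X ← islocked(m) can be sketched as follows. This is a simplified illustration, not the full transfer function of Fig. 19: it leaves out the import of well synchronized interferences through `in` and the generated interferences, and it assumes that the map `may_lock` (from threads to the mutexes they may lock, the component $M$ suggested above) is already known. Threads are integers, a larger one meaning a higher priority; all names are hypothetical.

```python
def assign(rho, x, v):
    """Environments are hashable tuples of (variable, value) pairs."""
    updated = dict(rho)
    updated[x] = v
    return tuple(sorted(updated.items()))

def islocked_transfer(x, m, t, states, may_lock):
    """Split each state ((l, u, s), rho) according to the outcome of
    islocked(m), keeping the link between x and the membership of m in u."""
    higher_may_lock = any(m in ms for (t2, ms) in may_lock.items() if t2 > t)
    result = set()
    for ((l, u, s), rho) in states:
        if higher_may_lock:
            # Imprecise case: x <- [0, 1], u is left unchanged.
            result.add(((l, u, s), assign(rho, x, 0)))
            result.add(((l, u, s), assign(rho, x, 1)))
        else:
            # Precise case: x = 0 goes together with m known to be free.
            result.add(((l, u | {m}, s), assign(rho, x, 0)))
            result.add(((l, u - {m}, s), assign(rho, x, 1)))
    return result
```

In the precise case, every output state where X is 0 has m in its $u$ component, which is the relation needed to analyze Fig. 14.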

We now discuss how synchronization primitives handle well synchronized interferences.
We use two auxiliary functions, $\mathit{in}(t, l, u, m, \rho, I)$ and $\mathit{out}(t, l, u, m, \rho, I)$, that model
respectively entering and exiting a critical section protected by a mutex $m \in \mathcal{M}$ in a thread
$t \in \mathcal{T}$. The arguments $l, u \subseteq \mathcal{M}$ reflect the scheduler configuration when the primitive is
called, i.e., they are respectively the set of mutexes held by thread $t$ and those assumed to
be free in the system. The function $\mathit{out}(t, l, u, m, \rho, I)$ collects a set of well synchronized
interferences from an environment $\rho \in \mathcal{E}$. These are constructed from the current value $\rho(X)$
of the variables X that have been modified while the mutex m was held. Such information
can be tracked precisely in the semantics by adding another component in $\mathcal{C} \to \mathcal{P}(\mathcal{V})$
to our program states $R$ but, for the sake of simplicity, the semantics we present simply
extracts this information from the interferences in $I$: we consider all the variables that
have some weakly consistent interference by thread $t$ in a configuration where it holds m ($m \in l$).
This may actually over-approximate the set of variables we seek as it includes variables
that have been modified in previous critical sections protected by the same mutex m, but
not in the current critical section. Given a variable X, the interference we store is then
$(t, (l, u, \mathit{sync}(m)), X, \rho(X))$. The function $\mathit{in}(t, l, u, m, \rho, I)$ applies well synchronized
interferences from $I$ to an environment $\rho$: it returns all the environments $\rho'$ that can be obtained
from $\rho$ by setting any number of variables (possibly none) to their interference value. It only considers well
synchronized interferences with configuration $\mathit{sync}(m)$ and from threads $t' \neq t$. Moreover,
it uses a test similar to that of $\mathit{intf}$ to avoid applying interferences that cannot occur due to
mutual exclusion, by comparing the current state of mutexes ($l$ and $u$) to their state when
the interference was stored.

Our prototype performs the same over-approximation for the sake of keeping the analysis simple, and we
did not find any practical occurrence where this resulted in a loss of precision. We explain this by remarking
that critical sections delimited by the same mutex tend to protect the same set of modified variables.

The function pair $\mathit{in}$ / $\mathit{out}$ is actually used to implement two kinds of critical sections.
A first kind stems from the use of lock(m) and unlock(m) statements, which naturally
delimit a critical section protected by m. Additionally, whenever a mutex m is added to the
$u$ scheduler component by a primitive X ← islocked(m), we also enter a critical section
protected by m. Thus, $\mathit{in}$ is called for mutex m, and $\mathit{intf}$ ensures that weakly consistent
interferences where m is locked are no longer applied. Such critical sections end when m
leaves $u$, that is, whenever the thread executes a blocking primitive: lock or yield. These
primitives call $\mathit{out}$ for every mutex currently in $u$, and reset $u$ to ∅ in the program state.
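The two auxiliary functions can be written almost literally from their definitions in Fig. 19 when all sets are finite. The sketch below assumes interferences are tuples `(t, (l, u, s), X, v)` with `s` equal to `"weak"` or `("sync", m)`, environments are dictionaries, and values are integers (so that they can be sorted); `in_` carries a trailing underscore only because `in` is a Python keyword.

```python
from itertools import product

def out(t, l, u, m, rho, interferences):
    """Publish the current value of every variable for which thread t has
    a weak interference generated while holding m."""
    return {(t, (l, u, ("sync", m)), x, rho[x])
            for (t2, (l2, _u2, s2), x, _v) in interferences
            if t2 == t and s2 == "weak" and m in l2}

def in_(t, l, u, m, rho, interferences):
    """All environments obtained from rho by replacing any subset of
    variables by a compatible sync(m) interference from another thread."""
    choices = {x: {v} for (x, v) in rho.items()}
    for (t2, (l2, u2, s2), x, v) in interferences:
        compatible = not (l & l2) and not (l & u2) and not (l2 & u)
        if s2 == ("sync", m) and t2 != t and compatible and x in choices:
            choices[x].add(v)
    names = sorted(choices)
    return [dict(zip(names, combo))
            for combo in product(*(sorted(choices[n]) for n in names))]
```

On the scenario of Fig. 16 (a), thread 1 leaves two weak interferences on X (values 1 and 2) while holding m; `out` at its unlock publishes only the final value 2, and `in_` at thread 2's lock returns the environments where X keeps its own value or takes the value 2.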

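Finally, we turn to the semantics $\mathbb{P}_{\mathcal{C}}$ of a program, which has the same fixpoint form
as $\mathbb{P}_{\mathcal{I}}$ (3.4):

$$ \mathbb{P}_{\mathcal{C}} \overset{\text{def}}{=} \Omega, \text{ where } (\Omega,-) = \mathrm{lfp}\ \lambda(\Omega,I) : \bigsqcup_{t \in \mathcal{T}} \text{let } (-,\Omega',I') = \mathbb{S}_{\mathcal{C}}[\![\mathit{body}_t, t]\!](\{c_0\} \times \mathcal{E}_0, \Omega, I) \text{ in } (\Omega',I') \tag{4.10} $$

where the initial configuration is $c_0 = (\emptyset, \emptyset, \mathit{weak}) \in \mathcal{C}$. This semantics is sound with
respect to that of Secs. 4.2-4.3:

Theorem 4.1. $\mathbb{P}_{\mathcal{H}} \subseteq \mathbb{P}_{\mathcal{C}}$ and $\mathbb{P}'_{\mathcal{H}} \subseteq \mathbb{P}_{\mathcal{C}}$.

Proof. In Appendix A.8. □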
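The fixpoint in (4.10) re-analyzes every thread with the interferences found so far until nothing new appears. The sketch below shows this iteration scheme on a toy example; it assumes finite sets (so that the iteration terminates), represents each thread by a hypothetical function from the current interferences to the errors and interferences it produces, and uses plain `(t, X, v)` interferences without configurations to stay short.

```python
def lfp(step, bottom):
    """Kleene iteration: apply `step` from `bottom` until stabilization."""
    current = bottom
    while True:
        following = step(current)
        if following == current:
            return current
        current = following

def analyze(thread_bodies):
    """Iterate all thread analyses until errors and interferences are stable."""
    def step(state):
        errors, interferences = state
        new_errors, new_interferences = set(errors), set(interferences)
        for t, body in thread_bodies.items():
            e, i = body(t, interferences)
            new_errors |= e
            new_interferences |= i
        return frozenset(new_errors), frozenset(new_interferences)
    return lfp(step, (frozenset(), frozenset()))

# Toy threads (assumptions for the example): initially X = 0.
def writer(t, interferences):
    return set(), {(t, "X", 1)}                      # X <- 1

def reader(t, interferences):
    xs = {0} | {v for (t2, x, v) in interferences if t2 != t and x == "X"}
    return set(), {(t, "Y", x + 1) for x in xs}      # Y <- X + 1
```

Here the first round only discovers the write of 1 into X and the write of 1 into Y; the second round lets the reader see X = 1 and adds the write of 2 into Y; the third round changes nothing, so the iteration stops.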



4.4.5. Multiprocessors and non-real-time systems. The only part of our semantics that
exploits the fact that only one thread can execute at a given time is the semantics of
X ← islocked(m). It assumes that, after the current thread has performed the test, the
state of the mutex m cannot change until the current thread calls a blocking primitive (lock
or yield), unless some higher priority thread can also lock the mutex m. Thus, in order
to obtain a semantics that is also sound for truly parallel or non-real-time systems, it is
sufficient to interpret all statements X ← islocked(m) as X ← [0, 1].

While more general, this semantics is less precise when analyzing a system that is known
to be mono-processor and real-time. For instance, this semantics cannot prove that the two
threads in Fig. 14 are in mutual exclusion and that, as a result, T = 0 at the end of the
program. It finds instead T ∈ {−1, 0, 1}, which is less precise. As our target application
(Sec. 5) is mono-processor and real-time, we will not discuss this more general but less
precise semantics further.

4.4.6. Detecting data-races. In our semantics, data-races silently cause weakly consistent
interferences but are otherwise not reported. It is easy to modify the semantics to output
them. Write / write data-races can be directly extracted from the computed set of
interferences $I$ gathered by the least fixpoint in (4.10) as follows:

\[
\{\, (t, t', X) \in \mathcal{T} \times \mathcal{T} \times \mathcal{V} \mid \exists c, c' : (t, c, X, -) \in I \wedge (t', c', X, -) \in I \wedge t \neq t' \wedge \mathit{intf}(c, c') \,\}
\]

is a set where each element $(t, t', X)$ indicates that threads $t$ and $t'$ may both write into
$X$ at the same time. Read / write data-races cannot be similarly extracted from $I$ as the
set of interferences does not remember which variables are read from, only which ones are
written to. A simple solution is to instrument the semantics of expressions so that, during
expression evaluation, it gathers the set of read variables that are affected by an interference.
This is performed, for instance, by $\mathbb{E}_{\mathcal{C},r}$, presented in Fig. 20. This function has the same
arguments as $\mathbb{E}_{\mathcal{C}}$, except that no environment $\rho$ is needed, and it outputs a set of data-races
$(t, t', X)$ instead of environments and errors.

\[
\begin{array}{l}
\mathbb{E}_{\mathcal{C},r}[\![\, e \,]\!] : (\mathcal{T} \times \mathcal{C} \times \mathcal{P}(\mathcal{I})) \to \mathcal{P}(\mathcal{T} \times \mathcal{T} \times \mathcal{V}) \\[4pt]
\mathbb{E}_{\mathcal{C},r}[\![\, X \,]\!](t, c, I) \stackrel{\text{def}}{=} \{\, (t, t', X) \mid \exists c' : (t', c', X, -) \in I \wedge t \neq t' \wedge \mathit{intf}(c, c') \,\} \\
\mathbb{E}_{\mathcal{C},r}[\![\, [c_1, c_2] \,]\!](t, c, I) \stackrel{\text{def}}{=} \emptyset \\
\mathbb{E}_{\mathcal{C},r}[\![\, -e \,]\!](t, c, I) \stackrel{\text{def}}{=} \mathbb{E}_{\mathcal{C},r}[\![\, e \,]\!](t, c, I) \\
\mathbb{E}_{\mathcal{C},r}[\![\, e_1 \circ e_2 \,]\!](t, c, I) \stackrel{\text{def}}{=} \mathbb{E}_{\mathcal{C},r}[\![\, e_1 \,]\!](t, c, I) \cup \mathbb{E}_{\mathcal{C},r}[\![\, e_2 \,]\!](t, c, I) \\
\quad \text{where } \circ \in \{+, -, \times, /\}
\end{array}
\]

Figure 20: Read / write data-race detection.

    E0 : X = Y = 5

    thread t1                      thread t2
    while 0 = 0 do                 while 0 = 0 do
        lock(m);                       lock(m);
        if X > 0 then                  if X < 10 then
            X ← X − 1;                     X ← X + 1;
            Y ← Y − 1;                     Y ← Y + 1;
        unlock(m)                      unlock(m)

Figure 21: Imprecisely analyzed program due to the lack of relational lock invariant.
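The two extractions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analyzer's implementation: an interference is a tuple `(thread, configuration, variable, value)`, a configuration is reduced to the frozenset of mutexes held (the real configurations of (4.7) carry more information), and `intf` is a hypothetical stand-in for the predicate of Fig. 18 that ignores priorities.

```python
from itertools import product

def intf(c1, c2):
    # Assumption: a configuration is the frozenset of held mutexes, and two
    # configurations may interfere iff they share no mutex (priorities ignored).
    return c1.isdisjoint(c2)

def write_write_races(interferences):
    """Triples (t, t2, X): threads t and t2 may write X at the same time."""
    return {
        (t, t2, x)
        for (t, c, x, _v), (t2, c2, x2, _v2) in product(interferences, repeat=2)
        if x == x2 and t != t2 and intf(c, c2)
    }

def read_write_races(expr, t, c, interferences):
    """Triples (t, t2, X): t reads X in configuration c while t2 may write X.
    Follows the structure of Fig. 20 on a tuple-encoded expression."""
    kind = expr[0]
    if kind == "var":
        x = expr[1]
        return {(t, t2, x) for (t2, c2, x2, _v) in interferences
                if x2 == x and t2 != t and intf(c, c2)}
    if kind == "const":          # constant interval [c1, c2]: reads nothing
        return set()
    if kind == "neg":            # unary minus
        return read_write_races(expr[1], t, c, interferences)
    if kind == "binop":          # ("binop", op, e1, e2)
        _, _op, e1, e2 = expr
        return (read_write_races(e1, t, c, interferences)
                | read_write_races(e2, t, c, interferences))
    raise ValueError(f"unknown expression: {expr!r}")

NO_LOCK, LOCK_M = frozenset(), frozenset({"m"})
I = {
    ("t1", LOCK_M, "X", 1), ("t2", LOCK_M, "X", 2),    # X is always written under m
    ("t1", NO_LOCK, "Y", 0), ("t2", NO_LOCK, "Y", 1),  # Y is written without any lock
}
```

With this toy interference set, only `Y` is reported as a write / write race, and reading `X` is reported only when the reader does not hold `m`.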



4.4.7. Precision. The interference abstraction we use in $\mathbb{P}_{\mathcal{C}}$ is sound but not complete with
respect to the interleaving-based semantics $\mathbb{P}_{\mathcal{H}}$. In addition to the incompleteness already
discussed in Sec. 3.2, some loss of precision comes from the handling of well synchronized
accesses. A main limitation is that such accesses are handled in a non-relational way,
hence $\mathbb{P}_{\mathcal{C}}$ cannot represent relations enforced at the boundary of critical sections but broken
within, while $\mathbb{P}_{\mathcal{H}}$ can. For instance, in Fig. 14, we cannot prove that $Y = Z$ holds outside
critical sections, but only that $Y, Z \in \{1, 2\}$. This shows in particular that even programs
without data-races have behaviors in $\mathbb{P}_{\mathcal{C}}$ outside the sequentially consistent ones. However,
we can prove that the assignment into $T$ is free from interference, and so, that $T = 0$. By
contrast, the interference semantics $\mathbb{P}_{\mathcal{I}}$ of Sec. 3.2 ignores synchronization and would output
$T \in \{-1, 0, 1\}$, which is less precise.

    E0 : X = Y = 0

    high thread                    low thread
    X ← 1;                         B ← 1/X;
    A ← 1/(Y − 1);                 Y ← 1
    yield

Figure 22: Imprecisely analyzed program due to the lack of inter-thread flow-sensitivity.

\[
\begin{array}{l}
\mathcal{I}^\sharp \stackrel{\text{def}}{=} (\mathcal{T} \times \mathcal{C} \times \mathcal{V}) \to \mathcal{N}^\sharp \\
\text{s.t. } \gamma_{\mathcal{I}}(I^\sharp) \stackrel{\text{def}}{=} \{\, (t, c, X, v) \mid t \in \mathcal{T},\ c \in \mathcal{C},\ X \in \mathcal{V},\ v \in \gamma_{\mathcal{N}}(I^\sharp(t, c, X)) \,\} \\[4pt]
\bot^\sharp_{\mathcal{I}} \stackrel{\text{def}}{=} \lambda (t, c, X) : \bot^\sharp_{\mathcal{N}} \\
I_1^\sharp \cup^\sharp_{\mathcal{I}} I_2^\sharp \stackrel{\text{def}}{=} \lambda (t, c, X) : I_1^\sharp(t, c, X) \cup^\sharp_{\mathcal{N}} I_2^\sharp(t, c, X) \\
I_1^\sharp \mathbin{\nabla_{\mathcal{I}}} I_2^\sharp \stackrel{\text{def}}{=} \lambda (t, c, X) : I_1^\sharp(t, c, X) \mathbin{\nabla_{\mathcal{N}}} I_2^\sharp(t, c, X)
\end{array}
\]

Figure 23: Abstract domain of scheduled interferences $\mathcal{I}^\sharp$, derived from $\mathcal{N}^\sharp$.

Figure 21 presents another example where the lack of relational interference results in a
loss of precision. This example implements an abstract producer / consumer system, where
a variable $X$ counts the number of resources, thread $t_1$ consumes resources ($X \leftarrow X - 1$)
if available ($X > 0$), and thread $t_2$ generates resources ($X \leftarrow X + 1$) if there is still room
for resources ($X < 10$). Our interference semantics can prove that $X$ is always bounded in
$[0, 10]$. However, it cannot provide an upper bound on the variable $Y$. Actually, $Y$ is also
bounded by $[0, 10]$ as it mirrors $X$. Proving this would require inferring a relational lock
invariant: $X = Y$.
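The loss of precision on the producer / consumer program of Fig. 21 can be reproduced with a small experiment. The sketch below is a deliberately simplified model, not the semantics of the paper: it keeps one interval per variable as the lock invariant and repeatedly applies both critical sections to it, without widening. All function names are hypothetical. A concrete exploration of the same program is included for comparison.

```python
def restrict(itv, lo=None, hi=None):
    """Intersect the interval itv = (a, b) with [lo, hi]; None when empty."""
    a, b = itv
    if lo is not None:
        a = max(a, lo)
    if hi is not None:
        b = min(b, hi)
    return (a, b) if a <= b else None

def join(p, q):
    return (min(p[0], q[0]), max(p[1], q[1]))

def critical_section(x, y, guard_lo, guard_hi, delta):
    """Non-relational effect of: if guard(X) then X <- X + delta; Y <- Y + delta."""
    taken = restrict(x, guard_lo, guard_hi)
    if taken is None:
        return x, y                          # the branch is never taken
    x_then = (taken[0] + delta, taken[1] + delta)
    y_then = (y[0] + delta, y[1] + delta)    # Y is updated without knowing X = Y
    return join(x, x_then), join(y, y_then)  # join with the untaken branch

def interval_lock_invariant(rounds):
    x, y = (5, 5), (5, 5)                    # initial state X = Y = 5
    for _ in range(rounds):
        cx, cy = critical_section(x, y, 1, None, -1)   # consumer, guard X > 0
        px, py = critical_section(x, y, None, 9, +1)   # producer, guard X < 10
        x, y = join(cx, px), join(cy, py)
    return x, y

def concrete_states():
    """All (X, Y) pairs reachable at the lock boundary in the concrete program."""
    seen, todo = set(), [(5, 5)]
    while todo:
        x, y = todo.pop()
        if (x, y) in seen:
            continue
        seen.add((x, y))
        if x > 0:
            todo.append((x - 1, y - 1))
        if x < 10:
            todo.append((x + 1, y + 1))
    return seen
```

In this model the guards keep `X` inside `[0, 10]`, whereas the bounds of `Y` drift by one at every round and never stabilize; the concrete exploration only ever reaches states with `X == Y`.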



Finally, Fig. 22 presents an example where the lack of inter-thread flow sensitivity 
results in a loss of precision. In this example, the high priority thread always executes first, 
until it reaches yield, at which point it allows the lower priority thread to execute. To 
prove that the expression $1/(Y - 1)$ does not perform an error, it is necessary to prove that
it is executed before the low thread stores 1 into Y. Likewise, to prove that the expression 
1/X does not perform an error in the low thread, it is necessary to prove that it is executed 
after the high thread stores 1 into X. With respect to flow sensitivity, our semantics is 
only able to express that an event is performed before another one within the same thread 
(intra-thread flow sensitivity) and that a thread communication between a pair of locations 
cannot occur (mutual exclusion), but it is not able to express that an event in a thread is 
performed before another one in another thread (inter-thread flow sensitivity). 

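4.5. Abstract Scheduled Interference Semantics $\mathbb{P}_{\mathcal{C}}^\sharp$. We now abstract the interference
semantics with scheduler $\mathbb{P}_{\mathcal{C}}$ from the preceding section in order to construct an effective
static analyzer. We reuse the ideas from the abstraction $\mathbb{P}_{\mathcal{I}}^\sharp$ of $\mathbb{P}_{\mathcal{I}}$ in Sec. 3.3. The main
difference is that we track precisely scheduler configurations in $\mathcal{C}$ (4.7), and we partition
abstract environments and interferences with respect to them.

As in Sec. 3.3, we assume that an abstract domain $\mathcal{E}^\sharp$ of environment sets $\mathcal{P}(\mathcal{E})$ is given
(with signature in Fig. 5), as well as an abstract domain $\mathcal{N}^\sharp$ of real sets $\mathcal{P}(\mathbb{R})$ (with signature
in Fig. 10). The abstract domain of interferences $\mathcal{I}^\sharp$, abstracting $\mathcal{I}$ (4.8), is obtained by
partitioning $\mathcal{N}^\sharp$ with respect to $\mathcal{T}$ and $\mathcal{V}$, similarly to the interference domain of Fig. 11,
but also $\mathcal{C}$, as shown in Fig. 23. As $\mathcal{V}$, $\mathcal{T}$, and $\mathcal{C}$ are all finite, a map from $\mathcal{T} \times \mathcal{C} \times \mathcal{V}$
to $\mathcal{N}^\sharp$ can indeed be represented in memory, and the join $\cup^\sharp_{\mathcal{I}}$ and widening $\nabla_{\mathcal{I}}$ can be
computed pointwise. Moreover, abstract environments $\mathcal{E}^\sharp$ are also partitioned with respect
to $\mathcal{C}$. Hence, the abstract semantic domain $\mathcal{D}_{\mathcal{C}}^\sharp$ abstracting $\mathcal{D}_{\mathcal{C}}$ (4.9) becomes:

\[
\mathcal{D}_{\mathcal{C}}^\sharp \stackrel{\text{def}}{=} (\mathcal{C} \to \mathcal{E}^\sharp) \times \mathcal{P}(\mathcal{L}) \times \mathcal{I}^\sharp . \tag{4.11}
\]

It is presented in Fig. 24.

\[
\begin{array}{l}
\mathcal{D}_{\mathcal{C}}^\sharp \stackrel{\text{def}}{=} (\mathcal{C} \to \mathcal{E}^\sharp) \times \mathcal{P}(\mathcal{L}) \times \mathcal{I}^\sharp \\
\text{s.t. } \gamma(R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} (\{\, (c, \rho) \mid c \in \mathcal{C},\ \rho \in \gamma_{\mathcal{E}}(R^\sharp(c)) \,\},\ \Omega,\ \gamma_{\mathcal{I}}(I^\sharp)) \\[4pt]
(R_1^\sharp, \Omega_1, I_1^\sharp) \cup^\sharp (R_2^\sharp, \Omega_2, I_2^\sharp) \stackrel{\text{def}}{=} (\lambda c : R_1^\sharp(c) \cup^\sharp_{\mathcal{E}} R_2^\sharp(c),\ \Omega_1 \cup \Omega_2,\ I_1^\sharp \cup^\sharp_{\mathcal{I}} I_2^\sharp) \\
(R_1^\sharp, \Omega_1, I_1^\sharp) \mathbin{\nabla} (R_2^\sharp, \Omega_2, I_2^\sharp) \stackrel{\text{def}}{=} (\lambda c : R_1^\sharp(c) \mathbin{\nabla_{\mathcal{E}}} R_2^\sharp(c),\ \Omega_1 \cup \Omega_2,\ I_1^\sharp \mathbin{\nabla_{\mathcal{I}}} I_2^\sharp)
\end{array}
\]

Figure 24: Abstract domain of statements $\mathcal{D}_{\mathcal{C}}^\sharp$, derived from $\mathcal{E}^\sharp$ and $\mathcal{I}^\sharp$.

\[
\begin{array}{l}
\mathbb{S}_{\mathcal{C}}^\sharp[\![\, X \leftarrow e, t \,]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} \\
\quad \text{let } \forall c \in \mathcal{C} : (R_c^\sharp, \Omega_c) = \mathbb{S}^\sharp[\![\, X \leftarrow \mathit{apply}(t, c, R^\sharp, I^\sharp, e) \,]\!](R^\sharp(c), \Omega) \text{ in} \\
\quad (\lambda c : R_c^\sharp,\ \bigcup_{c \in \mathcal{C}} \Omega_c,\ I^\sharp[\forall c \in \mathcal{C} : (t, c, X) \mapsto I^\sharp(t, c, X) \cup^\sharp_{\mathcal{N}} \mathit{get}(X, R_c^\sharp)]) \\[6pt]
\mathbb{S}_{\mathcal{C}}^\sharp[\![\, e \bowtie 0?, t \,]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} \\
\quad \text{let } \forall c \in \mathcal{C} : (R_c^\sharp, \Omega_c) = \mathbb{S}^\sharp[\![\, \mathit{apply}(t, c, R^\sharp, I^\sharp, e) \bowtie 0? \,]\!](R^\sharp(c), \Omega) \text{ in} \\
\quad (\lambda c : R_c^\sharp,\ \bigcup_{c \in \mathcal{C}} \Omega_c,\ I^\sharp) \\[6pt]
\mathbb{S}_{\mathcal{C}}^\sharp[\![\, \mathbf{if}\ e \bowtie 0\ \mathbf{then}\ s, t \,]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} \\
\quad (\mathbb{S}_{\mathcal{C}}^\sharp[\![\, s, t \,]\!] \circ \mathbb{S}_{\mathcal{C}}^\sharp[\![\, e \bowtie 0?, t \,]\!])(R^\sharp, \Omega, I^\sharp) \cup^\sharp \mathbb{S}_{\mathcal{C}}^\sharp[\![\, e \not\bowtie 0?, t \,]\!](R^\sharp, \Omega, I^\sharp) \\[6pt]
\mathbb{S}_{\mathcal{C}}^\sharp[\![\, \mathbf{while}\ e \bowtie 0\ \mathbf{do}\ s, t \,]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} \\
\quad \mathbb{S}_{\mathcal{C}}^\sharp[\![\, e \not\bowtie 0?, t \,]\!](\lim \lambda X^\sharp : X^\sharp \mathbin{\nabla} ((R^\sharp, \Omega, I^\sharp) \cup^\sharp (\mathbb{S}_{\mathcal{C}}^\sharp[\![\, s, t \,]\!] \circ \mathbb{S}_{\mathcal{C}}^\sharp[\![\, e \bowtie 0?, t \,]\!]) X^\sharp)) \\[6pt]
\mathbb{S}_{\mathcal{C}}^\sharp[\![\, s_1; s_2, t \,]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} (\mathbb{S}_{\mathcal{C}}^\sharp[\![\, s_2, t \,]\!] \circ \mathbb{S}_{\mathcal{C}}^\sharp[\![\, s_1, t \,]\!])(R^\sharp, \Omega, I^\sharp) \\[6pt]
\text{where:} \\
\mathit{apply}(t, c, R^\sharp, I^\sharp, e) \stackrel{\text{def}}{=} \\
\quad \text{let } \forall Y \in \mathcal{V} : V_Y^\sharp = \bigcup^\sharp_{\mathcal{N}} \{\, I^\sharp(t', c', Y) \mid t \neq t' \wedge \mathit{intf}(c, c') \,\} \text{ in} \\
\quad \text{let } \forall Y \in \mathcal{V} : e_Y = \left\{ \begin{array}{ll} Y & \text{if } V_Y^\sharp = \bot^\sharp_{\mathcal{N}} \\ \mathit{as\text{-}expr}(V_Y^\sharp \cup^\sharp_{\mathcal{N}} \mathit{get}(Y, R^\sharp(c))) & \text{if } V_Y^\sharp \neq \bot^\sharp_{\mathcal{N}} \end{array} \right. \text{ in} \\
\quad e[\forall Y \in \mathcal{V} : Y \mapsto e_Y]
\end{array}
\]

Figure 25: Abstract scheduled semantics of statements with interference.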



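Sound abstract transfer functions $\mathbb{S}_{\mathcal{C}}^\sharp$, derived from those in $\mathcal{E}^\sharp$ ($\mathbb{S}^\sharp$), are presented in
Figs. 25-26.

Assignments and tests in Fig. 25 are very similar to the non-scheduled case $\mathbb{S}_{\mathcal{I}}^\sharp$ (Fig. 13)
with two differences. Firstly, $\mathbb{S}^\sharp$ is applied pointwise to each abstract environment $R^\sharp(c) \in
\mathcal{E}^\sharp$, $c \in \mathcal{C}$. New interferences due to assignments are also considered pointwise. Secondly,
the apply function now takes as extra argument a configuration $c$, and then only considers
interferences from configurations $c'$ not in mutual exclusion with $c$. This is defined through
the same function $\mathit{intf}$ we used in the concrete semantics (Fig. 18). The semantics of
non-primitive statements is the same as for previous semantics, by structural induction on the
syntax of statements.

\[
\begin{array}{l}
\mathbb{S}_{\mathcal{C}}^\sharp[\![\, \mathbf{lock}(m), t \,]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} \\
\quad (\lambda (l, u, s) : \bigcup^\sharp_{\mathcal{E}} \{\, \mathit{in}^\sharp(t, l', \emptyset, m, R^\sharp(l', u', s), I^\sharp) \mid l = l' \cup \{m\} \wedge u = \emptyset \wedge u' \subseteq \mathcal{M} \wedge s = \mathit{weak} \,\},\ \ldots) \\[6pt]
\mathbb{S}_{\mathcal{C}}^\sharp[\![\, \mathbf{unlock}(m), t \,]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} \\
\quad (\lambda (l, u, s) : \bigcup^\sharp_{\mathcal{E}} \{\, R^\sharp(l', u', s) \mid l = l' \setminus \{m\} \wedge u = u' \wedge s = \mathit{weak} \,\}, \\
\quad \ \Omega,\ I^\sharp \cup^\sharp_{\mathcal{I}} \bigcup^\sharp_{\mathcal{I}} \{\, \mathit{out}^\sharp(t, l \setminus \{m\}, u, m, R^\sharp(l, u, s), I^\sharp) \mid l, u \subseteq \mathcal{M} \wedge s = \mathit{weak} \,\}) \\[6pt]
\mathbb{S}_{\mathcal{C}}^\sharp[\![\, \mathbf{yield}, t \,]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} \\
\quad (\lambda (l, u, s) : \bigcup^\sharp_{\mathcal{E}} \{\, R^\sharp(l', u', s) \mid l = l' \wedge u = \emptyset \wedge u' \subseteq \mathcal{M} \wedge s = \mathit{weak} \,\}, \\
\quad \ \Omega,\ I^\sharp \cup^\sharp_{\mathcal{I}} \bigcup^\sharp_{\mathcal{I}} \{\, \mathit{out}^\sharp(t, l, \emptyset, m', R^\sharp(l, u, s), I^\sharp) \mid l, u \subseteq \mathcal{M} \wedge m' \in u \wedge s = \mathit{weak} \,\}) \\[6pt]
\mathbb{S}_{\mathcal{C}}^\sharp[\![\, X \leftarrow \mathbf{islocked}(m), t \,]\!](R^\sharp, \Omega, I^\sharp) \stackrel{\text{def}}{=} \\
\quad \text{let } (R^{\sharp\prime}, -, I^{\sharp\prime}) = \mathbb{S}_{\mathcal{C}}^\sharp[\![\, X \leftarrow [0, 1], t \,]\!](R^\sharp, \Omega, I^\sharp) \text{ in} \\
\quad \text{if no thread } t' > t \text{ locks } m \text{, then:} \\
\qquad (\lambda (l, u, s) : \bigcup^\sharp_{\mathcal{E}} \{\, \text{let } (V^\sharp, -) = \left\{ \begin{array}{ll} \mathbb{S}^\sharp[\![\, X \leftarrow 0 \,]\!](\mathit{in}^\sharp(t, l', u', m, R^\sharp(l', u', s), I^\sharp), \Omega) & \text{if } m \in u \\ \mathbb{S}^\sharp[\![\, X \leftarrow 1 \,]\!](R^\sharp(l', u', s), \Omega) & \text{if } m \notin u \end{array} \right. \text{ in } V^\sharp \\
\qquad\qquad \mid l = l' \wedge u \setminus \{m\} = u' \setminus \{m\} \wedge s = \mathit{weak} \,\},\ \Omega,\ I^{\sharp\prime}) \\
\quad \text{otherwise:} \\
\qquad (R^{\sharp\prime}, \Omega, I^{\sharp\prime}) \\[6pt]
\text{where:} \\
\mathit{in}^\sharp(t, l, u, m, V^\sharp, I^\sharp) \stackrel{\text{def}}{=} \\
\quad V^\sharp \cup^\sharp_{\mathcal{E}} \bigcup^\sharp_{\mathcal{E}} \{\, \text{let } X^\sharp = I^\sharp(t', (l', u', \mathit{sync}(m)), X) \text{ in} \\
\qquad \text{let } (V^{\sharp\prime}, -) = \mathbb{S}^\sharp[\![\, X \leftarrow \mathit{as\text{-}expr}(X^\sharp) \,]\!](V^\sharp, \emptyset) \text{ in } V^{\sharp\prime} \\
\qquad \mid X \in \mathcal{V} \wedge t \neq t' \wedge l \cap l' = l \cap u' = l' \cap u = \emptyset \,\} \\[6pt]
\mathit{out}^\sharp(t, l, u, m, V^\sharp, I^\sharp) \stackrel{\text{def}}{=} \\
\quad \lambda (t', c, X) : \left\{ \begin{array}{ll} \mathit{get}(X, V^\sharp) & \text{if } t = t' \wedge c = (l, u, \mathit{sync}(m)) \wedge \exists c' = (l', -, \mathit{weak}) : m \in l' \wedge I^\sharp(t, c', X) \neq \bot^\sharp_{\mathcal{N}} \\ \bot^\sharp_{\mathcal{N}} & \text{otherwise} \end{array} \right.
\end{array}
\]

Figure 26: Abstract scheduled semantics with interference of synchronization primitives.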

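The semantics of synchronization primitives is presented in Fig. 26. It uses the functions
$\mathit{in}^\sharp$ and $\mathit{out}^\sharp$ which abstract, respectively, the functions $\mathit{in}$ and $\mathit{out}$ presented in Fig. 19. Like
their concrete versions, $\mathit{in}^\sharp$ and $\mathit{out}^\sharp$ take as arguments a current thread $t \in \mathcal{T}$, a mutex
$m \in \mathcal{M}$ protecting a critical section, and sets of mutexes $l, u \subseteq \mathcal{M}$ describing the current
scheduling configuration. Moreover, they take as arguments an abstract set of interferences
$I^\sharp \in \mathcal{I}^\sharp$ instead of a concrete set, and an abstract set of environments $V^\sharp \in \mathcal{E}^\sharp$ instead
of a single concrete one. The function $\mathit{out}^\sharp$ uses get (Fig. 10) to extract abstract sets of
variable values from abstract environments and construct new abstract well synchronized
interferences. The function $\mathit{in}^\sharp$ applies these interferences to an abstract environment by
converting them to an expression (using as-expr) and updating the value of variables (using
an assignment in $\mathbb{S}^\sharp$). Additionally, the semantics of synchronization primitives models
updating the scheduler configuration from $c' = (l', u', \mathit{weak})$ to $c = (l, u, \mathit{weak})$ by moving
abstract environments from $R^\sharp(c')$ into $R^\sharp(c)$; when partitions are collapsed, all the abstract
environments mapped to the same configuration $c$ are merged into $R^\sharp(c)$ using $\cup^\sharp_{\mathcal{E}}$. Finally,
the abstract analysis $\mathbb{P}_{\mathcal{C}}^\sharp$ computes a fixpoint with widening over abstract interferences,
which is similar to (3.6):

\[
\begin{array}{l}
\mathbb{P}_{\mathcal{C}}^\sharp \stackrel{\text{def}}{=} \Omega, \text{ where } (\Omega, -) \stackrel{\text{def}}{=} \\
\quad \lim \lambda (\Omega, I^\sharp) : \text{let } \forall t \in \mathcal{T} : (-, \Omega_t', I_t^{\sharp\prime}) = \mathbb{S}_{\mathcal{C}}^\sharp[\![\, \mathit{body}_t, t \,]\!](R_0^\sharp, \Omega, I^\sharp) \text{ in} \\
\qquad (\bigcup \{\, \Omega_t' \mid t \in \mathcal{T} \,\},\ I^\sharp \mathbin{\nabla_{\mathcal{I}}} \bigcup^\sharp_{\mathcal{I}} \{\, I_t^{\sharp\prime} \mid t \in \mathcal{T} \,\})
\end{array}
\tag{4.12}
\]

where the partitioned initial abstract environment $R_0^\sharp \in \mathcal{C} \to \mathcal{E}^\sharp$ is defined as:

\[
R_0^\sharp \stackrel{\text{def}}{=} \lambda c : \left\{ \begin{array}{ll} \mathcal{E}_0^\sharp & \text{if } c = (\emptyset, \emptyset, \mathit{weak}) \\ \bot^\sharp_{\mathcal{E}} & \text{otherwise} \end{array} \right.
\]

The resulting analysis is sound:

Theorem 4.2. $\mathbb{P}_{\mathcal{C}} \subseteq \mathbb{P}_{\mathcal{C}}^\sharp$.

Proof. In Appendix A.9. □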
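The shape of the outer iteration (4.12) can be illustrated with a toy thread-modular analysis. The sketch below is an assumption-laden simplification rather than the analysis of the paper: there are no locks or scheduler configurations, interferences map `(thread, variable)` to a single interval, and the two thread bodies (`t1`, `t2`) are hypothetical hand-written transfer functions for `while true: if Y < 10 then X <- Y + 1` and `while true: Y <- X`, starting from `X = Y = 0`.

```python
import math

INF = math.inf

def join(a, b):
    """Interval join; None stands for bottom."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):
    """Standard interval widening: an unstable bound jumps to infinity."""
    if old is None:
        return new
    if new is None:
        return old
    lo = old[0] if new[0] >= old[0] else -INF
    hi = old[1] if new[1] <= old[1] else INF
    return (lo, hi)

def view(var, init, me, interf):
    """Values of var seen by thread `me`: its own value joined with
    the interferences on var generated by every other thread."""
    out = init
    for (t, v), itv in interf.items():
        if v == var and t != me:
            out = join(out, itv)
    return out

def t1(me, interf):            # while true: if Y < 10 then X <- Y + 1
    y = view("Y", (0, 0), me, interf)
    if y[0] > 9:
        return {}
    return {"X": (y[0] + 1, min(y[1], 9) + 1)}

def t2(me, interf):            # while true: Y <- X
    return {"Y": view("X", (0, 0), me, interf)}

def analyze(threads):
    """Re-analyze every thread until the abstract interferences are stable."""
    interf, iterations = {}, 0          # a missing key stands for bottom
    while True:
        iterations += 1
        produced = {}
        for t, body in threads.items():
            for var, itv in body(t, interf).items():
                produced[(t, var)] = join(produced.get((t, var)), itv)
        nxt = {k: widen(interf.get(k), join(interf.get(k), produced.get(k)))
               for k in interf.keys() | produced.keys()}
        if nxt == interf:
            return interf, iterations
        interf = nxt

result, n_iterations = analyze({"t1": t1, "t2": t2})
```

On this example the iteration stabilizes after four rounds, with the sound but coarse interferences `X ∈ [1, +∞)` and `Y ∈ [0, +∞)`; recovering the tighter bound `10` would need a refinement such as a decreasing iteration or widening thresholds, which the sketch omits.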



Due to partitioning, $\mathbb{P}_{\mathcal{C}}^\sharp$ is less efficient than $\mathbb{P}_{\mathcal{I}}^\sharp$. The abstract semantic functions for
primitive statements, as well as the join $\cup^\sharp$ and widening $\nabla$, are performed pointwise on
all configurations $c \in \mathcal{C}$. However, a clever implementation need not represent explicitly
nor iterate over partitions mapping a configuration to an empty environment $\bot^\sharp_{\mathcal{E}}$ or an
empty interference $\bot^\sharp_{\mathcal{N}}$. The extra cost with respect to a non-scheduled analysis has thus a
component that is linear in the number of non-$\bot^\sharp_{\mathcal{E}}$ environment partitions and a component
linear in the number of non-$\bot^\sharp_{\mathcal{N}}$ interferences. Thankfully, partitioned environments are
extremely sparse: Sec. 5 shows that, in practice, at most program points, $R^\sharp(c) = \bot^\sharp_{\mathcal{E}}$ except
for a few configurations (at most 4 in our benchmark). Partitioned interferences are less
sparse (52 in our benchmark) because, being flow-insensitive, they accumulate information
for configurations reachable from any program point. However, this is not problematic: as
interferences are non-relational, a larger number of partitions can be stored and manipulated
efficiently.
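The sparse representation mentioned above can be sketched with an ordinary dictionary in which a missing key stands for the bottom element. This is a hypothetical illustration: configurations are triples `(l, u, s)` as in the text, and each abstract environment is reduced to a single interval so that the example stays short.

```python
def join_itv(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]))

def join_partitioned(r1, r2, join=join_itv):
    """Pointwise join of two sparse maps configuration -> abstract value.
    Only the partitions actually stored (non-bottom) are visited."""
    out = dict(r1)
    for c, v in r2.items():
        out[c] = join(out[c], v) if c in out else v
    return out

def move_partition(r, src, dst, join=join_itv):
    """Model a scheduler-configuration change by moving R(src) into R(dst),
    merging with whatever is already stored for dst."""
    out = dict(r)
    v = out.pop(src, None)
    if v is not None:
        out[dst] = join(out[dst], v) if dst in out else v
    return out

C0 = (frozenset(), frozenset(), "weak")          # no mutex held
C1 = (frozenset({"m"}), frozenset(), "weak")     # mutex m held

r1 = {C0: (0, 1)}
r2 = {C0: (3, 4), C1: (7, 7)}
```

The cost of `join_partitioned` is proportional to the number of stored partitions, not to the size of the full configuration space.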

Thanks to partitioning, the precision of $\mathbb{P}_{\mathcal{C}}^\sharp$ is much better than that of $\mathbb{P}_{\mathcal{I}}^\sharp$ in the
presence of locks and priorities. For instance, $\mathbb{P}_{\mathcal{C}}^\sharp$ using the interval domain discovers that
$T = 0$ in Fig. 14, while the analysis of Sec. 3.3 would only discover that $T \in [-1, 1]$ due to
spurious interferences from the high priority thread.

Figure 27: Hierarchy of abstractions in Astree and Thesee. Domains in boldface are specific
to Thesee and not included in Astree.



5. Experimental Results 



We implemented the abstract analysis of Sec. 4.5 in Thesee, our prototype analyzer based
on the Astree static analyzer [8]. We first describe succinctly Astree, then Thesee, and
finally our target application and its analysis by Thesee.



5.1. The Astree Analyzer. Astree is a static analyzer that checks for run-time errors
in embedded C programs. Astree accepts a fairly large subset of C, excluding notably
dynamic memory allocation and recursion, which are generally unused (or even forbidden) in
embedded code. Moreover, Astree does not analyze multi-threaded programs, which is the
very issue we address in the present article.

The syntax and semantics assumed by Astree are based on the C99 norm [35], supplemented
with the IEEE 754-1985 norm for floating-point arithmetic. The C99 norm
underspecifies many aspects of the semantics, leaving much leeway to compiler implementations,
including random undocumented and unpredictable behaviors in case of an error
such as an integer overflow. A strictly conforming program would rely only on the semantics
defined in the norm. Few programs are strictly conforming; they rely instead on additional,
platform-specific semantic hypotheses. This is especially true in the embedded software
industry, where programs are designed for a specific, well-controlled platform, and not for
portability. Thus, Astree provides options to set platform-specific semantic features, such
as the bit-size and byte-ordering of data-types, and the subsequent analysis is only sound
with respect to these hypotheses. The run-time errors checked by Astree are: overflows in
integer arithmetic and casts, integer divisions by zero, invalid bit-shifts, infinities and Not
a Number floating-point values (caused by overflows, divisions by zero, or invalid operations),
out-of-bound array accesses, invalid pointer arithmetic or dereferences (including
null, dangling, and misaligned pointers), and failure of user-defined assertions (specified in
a syntax similar to the standard assert function).

Astree takes as input the C source after preprocessing by a standard preprocessor 
and a configuration file describing the ranges of the program inputs (such as memory- 
mapped sensors) if any. It then runs fully automatically and outputs a list of alarms 
corresponding to potential errors, and optionally program invariants for selected program 
points and variables. Astree is sound in that it computes an over-approximation of all 
program traces, for all input scenarios. Moreover, the analysis continues even for erroneous 
program traces if the behavior after the error has a reasonable semantics. This is the case 
after integer overflows, for instance, using a wrap-around semantics, but it is not the case
after dereferencing a dangling pointer, which has truly unpredictable results. In all cases, 
when there is no alarm, or when all the alarms can be proved by other means to be spurious, 
then the program is indeed proved free of run-time error. 

Although Astree accepts a large class of C programs, it cannot analyze most of them
precisely and efficiently. It is specialized, by its choice of abstractions, towards control /
command aerospace code, for which it gives good results. Thanks to a modular design,
it can be adapted to other application domains by adding new abstractions. Actually,
the initial specialization towards control / command avionic software [8] was achieved by
incrementally adding new domains and refining existing ones until all false alarms could be
removed on a target family of large control software from Airbus (up to 1 M lines) [23].
The resulting analyzer achieved the zero false alarm goal in a few hours of computation
on a standard 2.66 GHz 64-bit Intel server, and could be deployed in an industrial context
[23]. This specialization can be continued with limited effort, at least for related application
domains, as shown by our case study on space software [9]. Astree is now a mature tool
industrialized by AbsInt [1].



Figure 27 presents the design of Astree as a hierarchy of abstract domains (we ignore
for now boldface domains, which are specific to Thesee). Actually, Astree does not contain
a single "super-domain" but rather many small or medium-sized domains that focus on a
specific kind of properties each, possess a specific encoding of these properties and algorithms
to manipulate them, and can be easily plugged in and out. One of the first domains included
in Astree was the simple interval domain [13] that expresses properties of the form $X \in [a, b]$
for every machine integer and floating-point variable $X \in \mathcal{V}$. The interval domain is key
as it is scalable, hence it can be applied to all variables at all program points. Moreover,
it is able to express sufficient conditions for the absence of many kinds of errors, e.g.,
overflows. Astree also includes relational domains, such as the octagon domain [46] able to
infer relations of the form $\pm X \pm Y \leq c$. Such relations are necessary at a few locations,
for instance to infer precise loop invariants, which then lead to tighter variable bounds.
However, as the octagon domain is less scalable, it is used only on a few variables, selected
automatically by a syntactic heuristic. Astree also includes abstract domains specific to the
target application domain, such as a domain to handle digital filtering featured in many
control / command applications [27]. The computations are performed in all the domains
in parallel, and the domains communicate information through a partially reduced product
[18], so that they can improve each other in a controlled way; a fully reduced product,
where all domains communicate all their finds, would not scale up.

In addition to numeric variables, the C language features pointers. Pointer values are
modeled in the concrete semantics of Astree as semi-symbolic pairs containing a variable
name and an integer byte-offset. The pointer abstract domain is actually a functor that adds
support for pointers to any (reduced product of) numerical abstract domain(s) by maintaining
internally for each pointer a set of pointed-to variables, and delegating the abstraction of the
offset to the underlying numerical domain(s) (associating a synthetic integer variable to each
offset). Another functor, the memory domain, handles the decomposition of aggregate variables
(such as arrays and structures) into simpler scalar variables. The decomposition is dynamic
to account for the weak type system of C and the frequent reinterpretation of the same
memory portions as values of different types (due to union types and to type-punning).
Both functors are described in [35]. Finally, a trace partitioning domain [44] adds a limited
amount (for efficiency) of path-sensitivity by maintaining at the current control point several
abstract states coming from execution traces with a different history (such as which branches
of if statements were taken in the past). The computation in these domains is driven by an
iterator that traverses the code by structural induction on its syntax, iterating loops with
widening and stepping into functions to achieve a fully flow- and context-sensitive analysis.
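As a concrete reminder of how the interval domain and loop iteration with widening fit together, here is a minimal sketch on the loop `i = 0; while (i < 100) i = i + 1;`. The class `Itv` and the helper names are hypothetical and unrelated to the Astree code base; the single decreasing step at the end is one common way to recover precision after widening.

```python
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class Itv:
    """Interval [lo, hi] over the integers, with possibly infinite bounds."""
    lo: float
    hi: float

    def join(self, o):
        return Itv(min(self.lo, o.lo), max(self.hi, o.hi))

    def widen(self, o):
        # Unstable bounds are pushed to infinity to force termination.
        return Itv(self.lo if o.lo >= self.lo else -math.inf,
                   self.hi if o.hi <= self.hi else math.inf)

    def meet(self, o):
        lo, hi = max(self.lo, o.lo), min(self.hi, o.hi)
        return Itv(lo, hi) if lo <= hi else None   # None stands for bottom

    def add(self, k):
        return Itv(self.lo + k, self.hi + k)

def loop_step(inv):
    """One abstract iteration at the loop head: init joined with body(inv & guard)."""
    init = Itv(0, 0)
    inside = inv.meet(Itv(-math.inf, 99))          # guard i < 100
    return init if inside is None else init.join(inside.add(1))

def analyze_loop():
    inv = Itv(0, 0)
    while True:                                    # increasing iteration with widening
        nxt = inv.widen(loop_step(inv))
        if nxt == inv:
            break
        inv = nxt
    widened = inv
    narrowed = loop_step(widened)                  # one decreasing iteration
    at_exit = narrowed.meet(Itv(100, math.inf))    # negated guard i >= 100
    return widened, narrowed, at_exit
```

Widening first yields `[0, +∞)` at the loop head, the decreasing step refines it to `[0, 100]`, and the loop exit is then known to satisfy `i = 100`.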

More information and pointers about Astree can be found in [7]. 

5.2. The Thesee Analyzer. Thesee is a prototype extension of Astree that uses the abstract
scheduled interference semantics of Sec. 4.5 to support the analysis of multi-threaded
programs. Thesee checks for the same classes of run-time errors as Astree. Additionally,
it reports data-races, but ignores other parallel-related hazards, such as dead-locks and
priority inversions, that are not described in our concrete semantics.

Thesee benefited directly from Astree's numerous abstract domains and iteration strategies
targeting embedded C code. Figure 27 presents the design of Thesee, where non-boldface
domains are inherited from Astree and boldface ones have been added.

Firstly, the memory domain has been modified to compute the abstract interferences
generated by the currently analyzed thread and apply the interferences from other threads.
We use the method of Fig. 25: the memory domain dynamically modifies expressions to
include interferences explicitly (e.g., replacing variables with intervals) before passing the
expressions to a stack of domains that are unaware of interferences. Interferences are themselves
stored and manipulated by a specific domain which maintains abstract sets of values.
Non-relational abstractions from Astree, such as intervals but also abstract pointer values,
are directly exploited to represent abstract interferences.
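The expression rewriting described above can be sketched as follows. This is a simplified illustration with hypothetical names: expressions are nested tuples, `env` maps each variable to the interval known to the current thread, and `interf` maps a variable to the join of the interferences from other threads that are not in mutual exclusion with the current configuration (that join is assumed to be computed beforehand).

```python
def join(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]))

def apply_interferences(expr, env, interf):
    """Replace each variable Y that has interferences by the constant interval
    env[Y] joined with interf[Y]; other variables are left untouched."""
    kind = expr[0]
    if kind == "var":
        y = expr[1]
        return ("itv", join(env[y], interf[y])) if y in interf else expr
    if kind == "itv":
        return expr
    if kind == "add":
        return ("add",
                apply_interferences(expr[1], env, interf),
                apply_interferences(expr[2], env, interf))
    raise ValueError(f"unknown expression: {expr!r}")

def eval_itv(expr, env):
    """Interval evaluation by a domain that knows nothing about interferences."""
    kind = expr[0]
    if kind == "var":
        return env[expr[1]]
    if kind == "itv":
        return expr[1]
    if kind == "add":
        a, b = eval_itv(expr[1], env), eval_itv(expr[2], env)
        return (a[0] + b[0], a[1] + b[1])
    raise ValueError(f"unknown expression: {expr!r}")

env = {"X": (1, 1), "Y": (0, 0)}
interf = {"Y": (5, 7)}                      # another thread may store 5..7 into Y
e = ("add", ("var", "X"), ("var", "Y"))
rewritten = apply_interferences(e, env, interf)
```

The design choice illustrated here is that only the rewriting step knows about threads; the evaluator is an unmodified sequential one.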

Secondly, a scheduler partitioning domain has been added. It maintains an abstraction
of environments and of interferences for each abstract scheduled configuration live at the
current program point. Then, for each configuration, it calls the underlying domain with the
abstract environment associated to this configuration, as well as the abstract interferences
that can affect this environment (i.e., a join of interferences from all configurations not
in mutual exclusion with the current one). Additionally, the scheduler domain interprets
directly all the instructions related to synchronization, which involves copying and joining
abstract environments from different configurations, as described in Fig. 26.

Finally, we introduced an additional, parallel iterator driving the whole analysis. Following
the execution model of the ARINC 653 specification, the parallel iterator first executes
the main function as a regular single-threaded program and collects the set of resources
(threads, synchronization objects) it creates. Then, as the program enters parallel mode,
the iterator analyzes each thread in sequence, and keeps re-analyzing them until their
interferences are stable (in parallel mode, no new thread may be created).

All these changes add up to approximately 10 K lines of code in the 100 K lines analyzer 
and did not require much structural change. 

5.3. Analyzed Application. Thesee has been applied to the analysis of a large industrial
program consisting of 1.7 M lines of unpreprocessed C code and 15 threads, and running
under an ARINC 653 real-time OS [3]. The analyzed program is quite complex as it mixes
string formatting, list sorting, network protocols (e.g., TFTP), and automatically generated
synchronous logic.

The application performs system calls that must be properly modeled by the analyzer.
To keep the analyzer simple, Thesee implements natively only low-level primitives to declare
and manipulate threads as well as simple mutexes having the semantics described in Sec. 4.1.
However, ARINC 653 objects have a more complex semantics. The analyzed program is
thus completed with a 2,500-line hand-written model of the ARINC 653 standard, designed
specifically for the analysis with Thesee. It implements all the system calls in C extended
with Thesee primitives. The model maps high-level ARINC 653 objects to lower-level
Thesee ones. For instance, ARINC processes have a name while Thesee threads only have
an integer identifier, so, the model keeps track of the correspondence between names and
identifiers in C arrays and implements system calls to look up names and identifiers. It also
emulates the ARINC semantics using Thesee primitives. For instance, a lock with a timeout
is modeled as a non-deterministic test that either actually locks the mutex, or yields and
returns an error code without locking the mutex. An important feature of the program we
analyze is that all potentially blocking calls have a finite timeout, so, by construction, no
dead-lock nor unbounded priority inversion can occur. This explains why we did not focus
on detecting statically these issues in the present article.
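The modeling of a lock with timeout as a non-deterministic test can be pictured as a function that returns every outcome the model allows, so that an analysis continues from both. The sketch below is purely illustrative: the actual model is written in C with Thesee primitives, and the function name and the return codes used here are placeholders.

```python
def lock_with_timeout(held, mutex):
    """Return all outcomes permitted by the model, as pairs
    (set of mutexes held afterwards, return code)."""
    locked = (held | {mutex}, "NO_ERROR")     # branch 1: the mutex is actually locked
    timed_out = (held, "TIMED_OUT")           # branch 2: yield, then report an error
    return {locked, timed_out}

def may_enter_critical_section(held, mutex):
    """Outcomes after which the caller is allowed to touch the protected data."""
    return {h for (h, code) in lock_with_timeout(held, mutex) if code == "NO_ERROR"}

outcomes = lock_with_timeout(frozenset(), "m")
```

Because the timed-out branch returns without holding the mutex, code that checks the return code before entering its critical section is still analyzed as properly synchronized.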

5.4. Analysis Results. At the time of writing, the analysis with Thesee of this application 
takes 27 h on our 2.66 GHz 64-bit Intel server. An important result is that only 5 iterations 
are required to stabilize abstract interferences. Moreover, there is a maximum of 52 parti- 
tions for abstract interferences and 4 partitions at most for abstract environments, so that 
the analysis fits in the 32 GB of memory of our server. The analysis currently generates 
2,136 alarms (slightly less than one alarm per 800 lines of unpreprocessed code). 

These figures have evolved before and during the writing of this article, as we improved
the analysis. Figure 28 presents the evolution of the number of alarms over a period of
18 months. As our improvement effort focuses on optimizing the analysis precision, we do
not present the detailed evolution of the analysis time (it oscillates between 14 h and 28 h,
with a number of iterations between 4 and 7) nor the memory consumption (stable at a little
under 30 GB). The number of alarms started at 12,257 alarms mid-2010, as reported in
[7, § VI]. This high initial number can be explained by the lack of specialization of the analyzer:
the first versions of Thesee were naturally tuned for avionic control / command software
as they inherited abstract domains $\mathcal{E}^\sharp$ and $\mathcal{N}^\sharp$ from Astree, but our target application for
Thesee is not limited to control / command processing. To achieve our current results, we
improved the numerical, pointer, and memory domains in Thesee, and designed new ones.

After preprocessing and removal of comments, empty lines, and multiple definitions, the code is 2.1 M
lines. The increase in size is due to the use of macros.

In the ARINC 653 [3] terminology, execution units in shared memory are called "processes"; they
correspond to POSIX threads and not to POSIX processes [34].

Intuitively, adding domains and refining their precision degrades the efficiency. However, inferring tighter
invariants can also reduce the number of loop iterations to reach a fixpoint, and so, improving the precision
may actually lower the overall analysis time.

Figure 28: Evolution of the number of alarms in the analysis of our target application as
we improved our prototype analyzer.

(a)

    int clock, acc;

    void accum(int reset)
    {
        static int t0;
        if (reset) {
            acc = 0;
        }
        else {
            acc += clock - t0;
        }
        t0 = clock;
        /* 0 ≤ acc ≤ clock */
    }

(b)

    struct t {
        int id;
        struct { char msg[23]; } x[3];
    } tab[12];

    char* end_of_msg(int ident, int pos)
    {
        int i;
        struct t* p = tab;
        for (i=0; i<12 && p[i].id!=ident; i++) ;
        char* m = p[i].x[pos].msg;
        for (i=0; i<23 && m[i]; i++) ;
        /* offset(m + i) ∈ 4 + 292[0, 11] + 96[0, 2] + [0, 22] */
        return m+i;
    }

Figure 29: Program fragments that required an improvement in the analyzer prototype.



We illustrate two of these improvements in Fig. 29. A first example is the improvement of
transfer functions of existing domains. For instance, the function accum from Fig. 29 (a)
accumulates, in acc, elapsed time as counted by a clock variable (updated elsewhere),
and we need to discover the invariant acc ≤ clock. This requires analyzing precisely
the incrementation of acc in a relational domain. As the octagon domain we use [46] only
supported precisely assignments involving two variables, we added new transfer functions for
three-variable assignments (another solution, using the more powerful polyhedron domain
[21], did not prove scalable enough). A second example is an improvement of the pointer
domain to precisely track pointers traversing nested arrays and structures, as in Fig. 29 (b)
where the precise location of the pointer m+i needs to be returned. Our target application
features similar complex data-structures and their traversal extensively. We thus added a
non-relational integer domain for offsets of the form $\alpha_0 + \sum_i \alpha_i [\ell_i, h_i]$, where the values of
$\alpha_i$, $\ell_i$, and $h_i$ are inferred dynamically. Note that all these improvements concern only the
abstract domain parameters; neither the interference iterator nor the scheduler partitioning
were refined.
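To make the offset form above concrete, the sketch below evaluates the abstract offset quoted in the comment of Fig. 29 (b), namely 4 + 292[0, 11] + 96[0, 2] + [0, 22]. The representation (a base plus a list of `(stride, low, high)` terms) and the function names are assumptions chosen for illustration; they are not taken from the analyzer.

```python
from itertools import product

def offset_bounds(a0, terms):
    """Smallest and largest byte offset described by a0 + sum of a * [l, h]."""
    lo = a0 + sum(min(a * l, a * h) for a, l, h in terms)
    hi = a0 + sum(max(a * l, a * h) for a, l, h in terms)
    return lo, hi

def offset_values(a0, terms):
    """Exact set of offsets, by enumerating every index combination."""
    ranges = [range(l, h + 1) for _a, l, h in terms]
    return {a0 + sum(a * k for (a, _l, _h), k in zip(terms, ks))
            for ks in product(*ranges)}

# Terms taken from the comment in Fig. 29 (b): (stride, lowest index, highest index).
TERMS = [(292, 0, 11), (96, 0, 2), (1, 0, 22)]
bounds = offset_bounds(4, TERMS)
values = offset_values(4, TERMS)
```

A plain interval would only retain the bounds `[4, 3430]`, which contain many byte offsets (such as 27) that the structured form excludes; this is the extra precision such an offset domain provides.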

Following the design-by-refinement used in Astree [8], we have focused on the analysis
of a single (albeit large and complex) industrial software and started refining the analyzer
to lower the number of alarms, instead of multiplying shallower case studies. We plan to
further improve the analysis precision in order to approach the zero false alarm goal. This
is the objective of the AstreeA project [20], successor to Thesee. The remaining 2,136
alarms can be categorized into three kinds. Firstly, some alarms are, similarly to the ones
described in Fig. 29, not related to parallelism but to the imprecision of the parameter
abstract domains. An important class of properties currently not supported by Astree nor
Thesee is that of memory shape properties [12]. In the context of embedded software,
dynamic memory allocation is disabled; nevertheless, our target code features dynamic
linked lists allocated in large static arrays. Another class of properties concerns the correct
manipulation of zero-terminated C strings. A significant part of the remaining alarms
may be removed by designing new memory domains for these properties. Secondly, some
alarms can be explained by an imprecise abstraction of thread interferences, similar to the
imprecision observed in Figs. 21-22 (these examples were inspired from our target code).
Hence the need to extend our framework to support relational and flow-sensitive abstractions
of interferences. Thirdly, some alarms have simply not yet been fully investigated. Although
Thesee provides verbose information on the context of each alarm as well as the thread-local
and interference invariants, discovering the origin of alarms is a challenging task on such
a large code: it often requires tracking the imprecision upstream and understanding the
interplay of thread interferences.

6. Related Work 

There are far too many works on the semantics and analysis of parallel programs to provide 
a fair survey and comparison here. Instead, we focus on a few works that are either recent 
or provide a fruitful comparison with ours. 

6.1. Interferences. The idea of attaching to each thread location a local invariant and
handling proofs of parallel programs similarly to that of sequential programs dates back
to the Hoare-style logic of Owicki and Gries [49] and the inductive assertion method of
Lamport [37, 40]. It has been well studied since; see [22] for a recent account and survey.
The difference between the proofs of sequential programs and that of parallel programs in
this framework is that the local invariants of each thread must be proved invariant by the
execution of all other threads (i.e., a non-interference check). These proof methods are
studied from an Abstract Interpretation theoretical point of view by Cousot and Cousot
[15], which leads to two results: an expression of each method as a decomposition of the
global invariant into thread-local invariants, and a framework to apply abstractions and
derive effective and sound static analyzers. When shifting from proof methods to inference
methods, the non-interference check naturally becomes an inference of interferences. Our
work is thus strongly inspired from [15]: it is based on an Owicki-Gries style decomposition
of the global invariant (although it is not only based on the control points of threads, but
also on a more complex scheduler state). The thread-local and interference parts are then
abstracted separately, using a relational flow-sensitive analysis for the former and a coarser
non-relational flow-insensitive analysis for the latter. Our work is also similar to the recent
static analysis of C programs with POSIX threads by Carre and Hymans [11]: both are
based on Abstract Interpretation and interference computation, and both are implemented
by modifying existing static analyses of sequential programs. Their analysis is more powerful
than ours in that it handles dynamic thread creation and the concrete semantics models
interferences as relations (instead of actions), but the subsequent abstraction leads to a
non-relational analysis; moreover, real-time scheduling is not considered.

6.2. Data-Flow Analysis. Fully flow-insensitive analyses, such as Steensgaard's points-to
analysis [58], naturally handle parallel programs. To our knowledge, all such analyses 
are also non-relational. These fast analyses are adequate for compiler optimization but, 
unfortunately, the level of accuracy required to prove safety properties demands the use of 
(at least partially) flow-sensitive and relational methods, which we do. By contrast, Salcianu 
and Rinard [54] proposed a flow-sensitive pointer and escape analysis for parallel programs 
which is more precise (and more costly), although it still targets program optimisation. It 
uses a notion of interference to model the effect of threads and method calls. 

6.3. Model Checking. Model-checking also has a long history of verifying parallel
systems, including recently weak memory models [6]. To prevent state explosion, Godefroid [31] 
introduced partial order reduction methods. They limit the number of interleavings to
consider, with no impact on soundness or completeness. Due to the emphasis on completeness, 
the remaining set of interleavings can still be large. By contrast, we abstract the problem 
sufficiently so that no interleaving needs to be considered at all, at the cost of completeness. 
Another way to reduce the complexity of model checking is the context-bound approach, as 
proposed by Qadeer et al. [51]. As it is unsound, it may fail to find some run-time errors. 
By contrast, our method takes into account all executions until completion. In his PhD, 
Malkis [42] used abstract interpretation to prove the equivalence of Owicki and Gries's proof 
method and the more recent model-checking algorithm by Flanagan and Qadeer [29], and 
presented an improvement based on counterexample-guided abstract refinement, a method 
which, unlike ours, is not guaranteed to converge in finite time. 

6.4. Weakly Consistent Memories. Weakly consistent memory models have been studied
originally for hardware; see [2] for a tutorial. Precise formal models are now available 
for popular architectures, such as the Intel x86 model by Sewell et al. [57], either inferred 
from informal processor documentation or reverse-engineered through "black-box" testing 
[5]. Pugh [50] pioneered the use of weakly consistent memory models in programming language
semantics in order to take into account hardware and compiler optimizations. This 
culminated in the Java memory model of Manson et al. [43, 32]. 

Weakly consistent memory models are now recognised as an important part of language 
semantics and are increasingly supported by verification methods. An example in model- 
checking is the work of Atig et al. [6]. An example in theorem proving is the extension by 
Sevcik et al. [56] of Leroy's formally proved C compiler [41]. Testing methods have also 
been proposed, such as that of Alglave et al. [5]. In the realm of static analysis, we can cite 
the static analysis by Abstract Interpretation of the happens-before memory model (at the 
core of the Java model) designed by Ferrara [28]. Recently, Alglave et al. proposed in [3] 
to lift analyses that are only sound with respect to sequential consistency, to analyses that 
are also sound in weak memory models. Their method is generic and uses a "repair loop" 
similar to our fixpoint of flow-insensitive interferences. 

Memory models are often defined implicitly, by restricting execution traces using global 
conditions, following the approach chosen by the Java memory model [32]. We chose instead 
a generative model based on local control path transformations, which is reminiscent of the 
approach by Saraswat et al. [55]. We believe that it matches more closely classic software 
and hardware optimizations. Note that we focus on models that are not only realistic, but 
also amenable to abstraction into an interference semantics. The first condition ensures the 
soundness of the static analysis, while the second one ensures its efficiency. 

6.5. Real-Time Scheduling. Many analyses of parallel programs assume arbitrary pre- 
emption, either implicitly at all program points (as in flow-insensitive analyses), or explicitly 
at specified program points (as in context-bounded approaches [51]), but few analyses model 
and exploit the strict scheduling policy of real-time schedulers. A notable exception is the 
work of Gamatie et al. [30] on the modeling of systems under an ARINC 653 operating 
system. As the considered systems are written in the SIGNAL language, their ARINC 
653 model is naturally also written in SIGNAL, while ours is written in C (extended with 
low-level primitives for parallelism, which were not necessary when modeling in SIGNAL 
as the language can naturally express parallelism). 

6.6. Further Comparison. A detailed comparison between domain-aware static analyz- 
ers, such as Astree, and other verification methods, such as theorem proving and model 
checking, is presented in [19]. These arguments are still valid in the context of a parallel 
program analysis and not repeated here. On the more specific topic of parallel program 
analysis, we refer the reader to the comprehensive survey by Rinard [53]. 

7. Conclusion 

We presented a static analysis by Abstract Interpretation to detect in a sound way run-time 
errors in embedded C software featuring several threads, a shared memory with weak con- 
sistency, mutual exclusion locks, thread priorities, and a real-time scheduler. Our method 
is based on a notion of interferences and a partitioning with respect to an abstraction of the 
scheduler state. It can be implemented on top of existing analyzers for sequential programs, 
leveraging a growing library of abstract domains. Promising early experimental results on 
an industrial code demonstrate the scalability of our approach. 

A broad avenue for future work is to bridge the gap between the interleaving semantics 
and its incomplete abstraction using interferences. In particular, it seems important to 
abstract interferences due to well synchronized accesses in a relational way (this is in par- 
ticular needed to remove some alarms remaining in our target application). We also wish 
to add some support for abstractions that are (at least partially) sensitive to the history 
of thread interleavings. This would be useful to exploit more properties of real-time
schedulers, related for instance to the guaranteed ordering of some computations (by contrast, 
we focused in this article mainly on properties related to mutual exclusion). 

Moreover, we wish to extend our framework to include more models of parallel compu- 
tations. This includes support for alternate real-time operating systems with similar sched- 
uling policies but manipulating different synchronization objects, for instance the condition 
variables in real-time POSIX systems [34], or alternate priority schemes, such as the priority 
ceiling protocol for mutexes. Another example is the support for the OSEK/VDX and Au- 
tosar real-time embedded platforms widely used in the automotive industry. We also wish 
to study more closely weak memory consistency semantics and, in particular, how to design 
more precise or more general interference semantics, and abstract them efficiently. Sup- 
porting atomic variables, recently included in the C and C++ languages, may also trigger 
the need for a finer, field-sensitive handling of weak memory consistency. 

A long term goal is the analysis of other errors specifically related to parallelism, such 
as dead-locks, live-locks, and priority inversions. In a real-time system, all system calls 
generally have a timeout in order to respect hard deadlines. Thus, interesting properties 
are actually quantitative: by construction, unbounded priority inversions cannot occur, so, 
we wish to detect bounded priority inversions. 

On the practical side, we wish to improve our prototype analyzer to reduce the number 
of false alarms on our target industrial code. This requires some of the improvements 
to the parallel analysis framework proposed above (such as relational and flow-sensitive 
abstractions for interferences), but also the design of new numerical, pointer, and memory 
domains which are not specific to parallel programs. 

Appendix A. Proof of Theorems 

A.1. Proof of Theorem 2.1, $\forall s \in \mathit{stat} : \mathsf{S}[\![ s ]\!]$ is well defined and a complete $\sqcup$-morphism. 



Proof. We prove both properties at the same time by structural induction on s. 

• The case of assignments $X \leftarrow e$ and guards $e \bowtie 0?$ is straightforward. 

• The semantics of conditionals and sequences is well-defined as its components are
well-defined by induction hypothesis. It is a complete $\sqcup$-morphism, by induction hypothesis 
and the fact that the composition and the join of complete $\sqcup$-morphisms are complete 
$\sqcup$-morphisms. 

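• For a loop while $e \bowtie 0$ do $s$, consider $F$ and $G$ defined as: 

\[
\begin{array}{rcl}
F(X) &=& (\mathsf{S}[\![ s ]\!] \circ \mathsf{S}[\![ e \bowtie 0? ]\!])(X) \\
G(X) &=& (R, \Omega) \sqcup F(X) .
\end{array}
\]

By induction hypothesis, $F$, and so $G$, are complete $\sqcup$-morphisms. Thus, $G$ has a least 
fixpoint [14]. We note that: 

\[ \mathsf{S}[\![ \textbf{while } e \bowtie 0 \textbf{ do } s ]\!](R, \Omega) = \mathsf{S}[\![ e \not\bowtie 0? ]\!](\mathrm{lfp}\, G) \]

which proves that the semantics of loops is well-defined. Moreover, according to [14] again, 
the least fixpoint can be computed by countable Kleene iterations: $\mathrm{lfp}\, G = \bigsqcup_{i \in \mathbb{N}} G^i(\emptyset, \emptyset)$. 
We now prove by induction on $i$ that $G^i(\emptyset, \emptyset) = \bigsqcup_{k<i} F^k(R, \Omega)$. Indeed: 

\[
\begin{array}{rcl}
G^1(\emptyset, \emptyset) &=& (R, \Omega) \sqcup F(\emptyset, \emptyset) \\
&=& F^0(R, \Omega)
\end{array}
\]

and 

\[
\begin{array}{rcl}
G^{i+1}(\emptyset, \emptyset) &=& G(\bigsqcup_{k<i} F^k(R, \Omega)) \\
&=& (R, \Omega) \sqcup F(\bigsqcup_{k<i} F^k(R, \Omega)) \\
&=& (R, \Omega) \sqcup \bigsqcup_{k<i} F^{k+1}(R, \Omega)
\end{array}
\]

because $F$ is a $\sqcup$-morphism. As a consequence, $\mathrm{lfp}\, G = \bigsqcup_{i \in \mathbb{N}} F^i(R, \Omega)$. Note that, 
$\forall i \in \mathbb{N} : F^i$ is a complete $\sqcup$-morphism. Thus, the function $(R, \Omega) \mapsto \mathrm{lfp}\, G$ is also a 
complete $\sqcup$-morphism, and so is $\mathsf{S}[\![ \textbf{while } e \bowtie 0 \textbf{ do } s ]\!]$. □ 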
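
The identity between the Kleene iterates of $G$ and the union of the iterates of $F$ can be checked on a small finite model. The following Python sketch is only an illustration under simplifying assumptions that are not in the paper: environments are plain integers, the error component is omitted, and the loop body and guard (`x < 6`, `x + 2`) are hypothetical.

```python
# Toy model (assumption): an environment is an int, the guard is x < 6,
# and the loop body is x <- x + 2. The error component is left out.

def F(X):
    """Body after guard; an element-wise image, hence a complete union-morphism."""
    return frozenset(x + 2 for x in X if x < 6)

def lfp(G):
    """Kleene iteration from the empty set (terminates on this finite model)."""
    X = frozenset()
    while True:
        nxt = G(X)
        if nxt == X:
            return X
        X = nxt

def union_of_iterates(R, bound=20):
    """Union of F^k(R) for k < bound."""
    acc, cur = frozenset(), frozenset(R)
    for _ in range(bound):
        acc |= cur
        cur = F(cur)
    return acc

R = frozenset({0, 1})
G = lambda X: R | F(X)          # G(X) = R joined with F(X)
loop_invariant = lfp(G)
loop_exit = frozenset(x for x in loop_invariant if not x < 6)  # negated guard
```

On this model both computations yield the same loop invariant, and filtering it by the negated guard gives the environments that leave the loop.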



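A.2. Proof of Theorem 2.2, $\mathsf{P} \subseteq \mathsf{P}^\sharp$. 

Proof. We prove by structural induction on $s$ that: 

\[ \forall s, R^\sharp, \Omega : (\mathsf{S}[\![ s ]\!] \circ \gamma)(R^\sharp, \Omega) \sqsubseteq (\gamma \circ \mathsf{S}^\sharp[\![ s ]\!])(R^\sharp, \Omega) . \]

• The case of primitive statements $X \leftarrow e$ and $e \bowtie 0?$ holds by hypothesis: the primitive 
abstract functions provided by the abstract domain are assumed to be sound. 

• The case of sequences $s_1; s_2$ is settled by noting that the composition of sound abstractions 
is a sound abstraction. 

• The case of conditionals if $e \bowtie 0$ then $s$ is settled by noting additionally that $\sqcup^\sharp$ is a 
sound abstraction of $\sqcup$, as $\cup^\sharp_{\mathcal{E}}$ is a sound abstraction of $\cup$. 

• We now treat the case of loops. By defining: 

\[
\begin{array}{rcl}
F^\sharp(X^\sharp) &=& (R^\sharp, \Omega) \sqcup^\sharp (\mathsf{S}^\sharp[\![ s ]\!] \circ \mathsf{S}^\sharp[\![ e \bowtie 0? ]\!])(X^\sharp) \\
F(X) &=& (R, \Omega) \sqcup (\mathsf{S}[\![ s ]\!] \circ \mathsf{S}[\![ e \bowtie 0? ]\!])(X)
\end{array}
\]

we have 

\[
\begin{array}{rcl}
\mathsf{S}^\sharp[\![ \textbf{while } e \bowtie 0 \textbf{ do } s ]\!](R^\sharp, \Omega) &=& \mathsf{S}^\sharp[\![ e \not\bowtie 0? ]\!](\lim \lambda X^\sharp : X^\sharp \triangledown F^\sharp(X^\sharp)) \\
\mathsf{S}[\![ \textbf{while } e \bowtie 0 \textbf{ do } s ]\!](R, \Omega) &=& \mathsf{S}[\![ e \not\bowtie 0? ]\!](\mathrm{lfp}\, F) .
\end{array}
\]

By induction hypothesis, soundness of $\sqcup^\sharp$ and composition of sound functions, $F^\sharp$ is a 
sound abstraction of $F$. Assume now that $(R^{\sharp\prime}, \Omega')$ is the limit of iterating $\lambda X^\sharp : X^\sharp \triangledown F^\sharp(X^\sharp)$ 
from $(\bot^\sharp_{\mathcal{E}}, \emptyset)$. Then, it is a fixpoint of $\lambda X^\sharp : X^\sharp \triangledown F^\sharp(X^\sharp)$, hence $(R^{\sharp\prime}, \Omega') = (R^{\sharp\prime}, \Omega') \triangledown 
F^\sharp(R^{\sharp\prime}, \Omega')$. Applying $\gamma$ and the soundness of $\triangledown$ and $F^\sharp$, we get: 

\[
\begin{array}{rcl}
\gamma(R^{\sharp\prime}, \Omega') &=& \gamma((R^{\sharp\prime}, \Omega') \triangledown F^\sharp(R^{\sharp\prime}, \Omega')) \\
&\sqsupseteq& \gamma(F^\sharp(R^{\sharp\prime}, \Omega')) \\
&\sqsupseteq& F(\gamma(R^{\sharp\prime}, \Omega')) .
\end{array}
\]

Thus, $\gamma(R^{\sharp\prime}, \Omega')$ is a post-fixpoint of $F$. As a consequence, it over-approximates $\mathrm{lfp}\, F$, 
which implies the soundness of the semantics of loops. 

We now apply the property we proved to $s = \mathit{body}$, $R^\sharp = \mathcal{E}_0^\sharp$ and $\Omega = \emptyset$, and use the 
monotony of $\mathsf{S}[\![ s ]\!]$ and the fact that $\gamma(\mathcal{E}_0^\sharp) \supseteq \mathcal{E}_0$ to get: 

\[
\begin{array}{rcl}
\gamma(\mathsf{S}^\sharp[\![ \mathit{body} ]\!](\mathcal{E}_0^\sharp, \emptyset)) &\sqsupseteq& \mathsf{S}[\![ \mathit{body} ]\!](\gamma(\mathcal{E}_0^\sharp, \emptyset)) \\
&\sqsupseteq& \mathsf{S}[\![ \mathit{body} ]\!](\mathcal{E}_0, \emptyset)
\end{array}
\]

which implies $\mathsf{P}^\sharp \supseteq \mathsf{P}$. □ 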
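
The post-fixpoint argument for loops can be illustrated with intervals. The sketch below is a hedged toy example, not the analyzer's implementation: it assumes the standard interval widening, a single integer variable, and the hypothetical loop `x = 0; while x < 100: x = x + 1`. Abstract values are pairs `(lo, hi)`, with `None` standing for the bottom element.

```python
import math

def join(a, b):
    """Interval join; None is bottom."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Standard interval widening: an unstable bound jumps to infinity."""
    if a is None:
        return b
    if b is None:
        return a
    lo = a[0] if a[0] <= b[0] else -math.inf
    hi = a[1] if a[1] >= b[1] else math.inf
    return (lo, hi)

def leq(a, b):
    """Interval inclusion."""
    if a is None:
        return True
    if b is None:
        return False
    return b[0] <= a[0] and a[1] <= b[1]

def F_sharp(X, init=(0, 0)):
    """Abstract step: entry value joined with the body applied after the guard."""
    if X is None:
        return init
    lo, hi = X[0], min(X[1], 99)          # guard x < 100 on integers
    if lo > hi:
        return init
    return join(init, (lo + 1, hi + 1))   # body x <- x + 1

def iterate_with_widening():
    X = None
    while True:
        nxt = widen(X, F_sharp(X))
        if nxt == X:
            return X
        X = nxt

limit = iterate_with_widening()
```

The limit is stable under `F_sharp`, so it is a post-fixpoint, and it contains every concrete value reachable at the loop head (0 to 100), although it is much coarser than the least fixpoint.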



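A.3. Proof of Theorem 2.3, $\forall s \in \mathit{stat} : \Pi[\![ \pi(s) ]\!] = \mathsf{S}[\![ s ]\!]$. 

Proof. The proof is by structural induction on $s$. 

• The case of primitive statements is straightforward as $\pi(s) = \{ s \}$. 

• For sequences, we use the induction hypothesis and the fact that $\Pi$ is a morphism for 
path concatenation to get: 

\[
\begin{array}{rcl}
\mathsf{S}[\![ s_1; s_2 ]\!] &=& \Pi[\![ \pi(s_2) ]\!] \circ \Pi[\![ \pi(s_1) ]\!] \\
&=& \Pi[\![ \pi(s_1) \cdot \pi(s_2) ]\!] \\
&=& \Pi[\![ \pi(s_1; s_2) ]\!] .
\end{array}
\]

• For conditionals, we also use the fact that $\Pi$ is a morphism for $\cup$: 

\[
\begin{array}{rcl}
\mathsf{S}[\![ \textbf{if } e \bowtie 0 \textbf{ then } s ]\!](R, \Omega) 
&=& (\mathsf{S}[\![ s ]\!] \circ \mathsf{S}[\![ e \bowtie 0? ]\!])(R, \Omega) \sqcup \mathsf{S}[\![ e \not\bowtie 0? ]\!](R, \Omega) \\
&=& (\Pi[\![ \pi(s) ]\!] \circ \Pi[\![ \{ e \bowtie 0? \} ]\!])(R, \Omega) \sqcup \Pi[\![ \{ e \not\bowtie 0? \} ]\!](R, \Omega) \\
&=& \Pi[\![ ((e \bowtie 0?) \cdot \pi(s)) \cup \{ e \not\bowtie 0? \} ]\!](R, \Omega) \\
&=& \Pi[\![ \pi(\textbf{if } e \bowtie 0 \textbf{ then } s) ]\!](R, \Omega) .
\end{array}
\]

• For loops while $e \bowtie 0$ do $s$, we define $F$ and $G$ as in proof A.1, i.e., $F(X) = (\mathsf{S}[\![ s ]\!] \circ 
\mathsf{S}[\![ e \bowtie 0? ]\!])X$ and $G(X) = (R, \Omega) \sqcup F(X)$. Recall then that $\mathrm{lfp}\, G = \bigsqcup_{i \in \mathbb{N}} F^i(R, \Omega)$. By 
induction hypothesis and $\cdot$-morphism, we have: 

\[
\begin{array}{rcl}
F^i &=& (\mathsf{S}[\![ s ]\!] \circ \mathsf{S}[\![ e \bowtie 0? ]\!])^i \\
&=& (\Pi[\![ \{ e \bowtie 0? \} \cdot \pi(s) ]\!])^i \\
&=& \Pi[\![ (\{ e \bowtie 0? \} \cdot \pi(s))^i ]\!] .
\end{array}
\]

Let us now define the set of paths $P = \mathrm{lfp}\, \lambda X : \{ \epsilon \} \cup (X \cdot \{ e \bowtie 0? \} \cdot \pi(s))$. By [14], 
$P = \bigcup_{i \in \mathbb{N}} (\{ e \bowtie 0? \} \cdot \pi(s))^i$. As a consequence: 

\[ \Pi[\![ P ]\!] = \bigsqcup_{i \in \mathbb{N}} \Pi[\![ (\{ e \bowtie 0? \} \cdot \pi(s))^i ]\!] = \bigsqcup_{i \in \mathbb{N}} F^i \]

and $\Pi[\![ P ]\!](R, \Omega) = \bigsqcup_{i \in \mathbb{N}} F^i(R, \Omega) = \mathrm{lfp}\, G$. 
Finally: 

\[
\begin{array}{rcl}
\mathsf{S}[\![ \textbf{while } e \bowtie 0 \textbf{ do } s ]\!](R, \Omega) 
&=& \mathsf{S}[\![ e \not\bowtie 0? ]\!](\mathrm{lfp}\, G) \\
&=& \Pi[\![ \{ e \not\bowtie 0? \} ]\!](\Pi[\![ P ]\!](R, \Omega)) \\
&=& \Pi[\![ P \cdot \{ e \not\bowtie 0? \} ]\!](R, \Omega) \\
&=& \Pi[\![ \pi(\textbf{while } e \bowtie 0 \textbf{ do } s) ]\!](R, \Omega)
\end{array}
\]

which concludes the proof. □ 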
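
For loop-free statements, the equality between the structural semantics and the join over control paths can be checked mechanically. The Python sketch below is an illustration under assumptions that are not in the paper: statements are encoded as hypothetical tuples, environments are finite maps, the error component is dropped, and loops are left out so that the set of paths stays finite.

```python
# Hypothetical statement encoding:
#   ("assign", var, fn)   var <- fn(env)
#   ("guard", pred)       keep the environments satisfying pred
#   ("seq", s1, s2)       s1; s2
#   ("if", pred, s)       if pred then s
# An environment is a frozenset of (variable, value) pairs.

def step(prim, R):
    """Semantics of one primitive statement on a set of environments."""
    if prim[0] == "assign":
        _, x, fn = prim
        return frozenset(frozenset({**dict(r), x: fn(dict(r))}.items()) for r in R)
    _, pred = prim
    return frozenset(r for r in R if pred(dict(r)))

def S(s, R):
    """Structural semantics, by induction on the statement."""
    if s[0] in ("assign", "guard"):
        return step(s, R)
    if s[0] == "seq":
        return S(s[2], S(s[1], R))
    _, pred, body = s
    neg = ("guard", lambda e: not pred(e))
    return S(body, step(("guard", pred), R)) | step(neg, R)

def paths(s):
    """pi(s): control paths, each a tuple of primitive statements."""
    if s[0] in ("assign", "guard"):
        return [(s,)]
    if s[0] == "seq":
        return [p + q for p in paths(s[1]) for q in paths(s[2])]
    _, pred, body = s
    neg = ("guard", lambda e: not pred(e))
    return [(("guard", pred),) + p for p in paths(body)] + [(neg,)]

def Pi(P, R):
    """Path semantics: join, over all paths, of the composed primitive steps."""
    out = frozenset()
    for p in P:
        cur = R
        for prim in p:
            cur = step(prim, cur)
        out |= cur
    return out

# if x > 0 then y <- x; x <- x + 1
prog = ("seq",
        ("if", lambda e: e["x"] > 0, ("assign", "y", lambda e: e["x"])),
        ("assign", "x", lambda e: e["x"] + 1))
R0 = frozenset(frozenset({"x": v, "y": 0}.items()) for v in (-1, 0, 2))
```

Here `paths(prog)` contains one path per branch of the conditional, and running each path separately before joining gives the same environments as the structural semantics.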



A.4. Proof of Theorem 3.1, $\forall t \in \mathcal{T}, s \in \mathit{stat} : \Pi_I[\![ \pi(s), t ]\!] = \mathsf{S}_I[\![ s, t ]\!]$. 

Proof. The proof A.3 of Thm. 2.3 only relies on the fact that the semantic functions $\mathsf{S}[\![ s ]\!]$ 
are complete $\sqcup$-morphisms. As the functions $\mathsf{S}_I[\![ s, t ]\!]$ are complete $\sqcup_I$-morphisms, the 
same proof holds. □ 



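A.5. Proof of Theorem 3.2, $\mathsf{P}_* \subseteq \mathsf{P}_I$. 

Proof. To ease the proof, we will use the notations $R(X)$, $\Omega(X)$, $I(X)$ to denote the various 
components of a triplet $X = (R(X), \Omega(X), I(X))$ in the concrete domain $\mathcal{D}_I = \mathcal{P}(\mathcal{E}) \times 
\mathcal{P}(\mathcal{L}) \times \mathcal{P}(\mathcal{I})$, and $V(X)$, $\Omega(X)$ for pairs $X = (V(X), \Omega(X))$ in $\mathcal{P}(\mathbb{R}) \times \mathcal{P}(\mathcal{L})$ output by the 
semantics of expressions. 

Let $(\Omega_I, I_I)$ be the fixpoint computed in (3.4), i.e., $(\Omega_I, I_I) = \bigsqcup_{t \in \mathcal{T}} (\Omega'_t, I'_t)$ where 
$(-, \Omega'_t, I'_t) = \mathsf{S}_I[\![ \mathit{body}_t, t ]\!](\mathcal{E}_0, \Omega_I, I_I)$. Then, by definition, $\mathsf{P}_I = \Omega_I$. We first prove the 
following properties that compare respectively the set of errors and environments output 
by the interleaving semantics and the interference semantics of any path $p \in \pi_*$: 

(i) $\Omega(\Pi_*[\![ p ]\!](\mathcal{E}_0, \emptyset)) \subseteq \bigcup_{t \in \mathcal{T}} \Omega(\Pi_I[\![ \mathit{proj}_t(p), t ]\!](\mathcal{E}_0, \emptyset, I_I))$ 

(ii) $\forall t \in \mathcal{T}, \rho \in R(\Pi_*[\![ p ]\!](\mathcal{E}_0, \emptyset)) : \exists \rho' \in R(\Pi_I[\![ \mathit{proj}_t(p), t ]\!](\mathcal{E}_0, \emptyset, I_I)) :$ 
$\forall X \in \mathcal{V} : (\rho(X) = \rho'(X) \vee \exists t' \neq t : (t', X, \rho(X)) \in I_I)$ 

The proof is by induction on the length of $p$. The case $p = \epsilon$ is straightforward: the $\Omega$ 
component is $\emptyset$ on both sides of (i), and we can take $\rho' = \rho$ in (ii) as the $R$ component is 
$\mathcal{E}_0$ on both sides. Consider now $p = p' \cdot (s', t')$, i.e., $p$ is a path $p'$ followed by an assignment 
or guard $s'$ from thread $t'$. Consider some $\rho \in R(\Pi_*[\![ p' ]\!](\mathcal{E}_0, \emptyset))$ and the expression $e'$ 
appearing in $s'$. We can apply the (ii) recurrence hypothesis to $p'$ and $t'$, which returns some 
$\rho' \in R(\Pi_I[\![ \mathit{proj}_{t'}(p'), t' ]\!](\mathcal{E}_0, \emptyset, I_I))$. The fact that, for any $X \in \mathcal{V}$, either $\rho(X) = \rho'(X)$ or 
$\exists t'' \neq t' : (t'', X, \rho(X)) \in I_I$ implies, given the definition of $\mathsf{E}_I[\![ e' ]\!]$ (Fig. 7), that: 

(iii) $\mathsf{E}[\![ e' ]\!]\rho \sqsubseteq \mathsf{E}_I[\![ e' ]\!](t', \rho', I_I)$ 

When considering, in particular, the error component $\Omega(\mathsf{E}[\![ e' ]\!]\rho)$, (iii) allows extending the 
recurrence hypothesis (i) on $p'$ to also hold on $p$, i.e., executing $s'$ adds more errors on the 
right-hand side of (i) than on the left-hand side. 

To prove (ii) on $p$, we consider two cases, depending on the kind of statement $s'$: 

• Assume that $s'$ is a guard, say $e' = 0?$ (the proof is identical for other guards $e' \bowtie 0?$). 
Take any $t \in \mathcal{T}$ and $\rho \in R(\Pi_*[\![ p ]\!](\mathcal{E}_0, \emptyset))$. By definition of $\mathsf{S}[\![ e' = 0? ]\!]$, we have $\rho \in 
R(\Pi_*[\![ p' ]\!](\mathcal{E}_0, \emptyset))$ and $0 \in V(\mathsf{E}[\![ e' ]\!]\rho)$. Take any $\rho'$ that satisfies the recurrence hypothesis 
(ii) on $p'$ for $t$ and $\rho$. We prove that it also satisfies (ii) on $p$ for $t$ and $\rho$. 
The case $t \neq t'$ is straightforward as, in this case, $\mathit{proj}_t(p) = \mathit{proj}_t(p')$. 
When $t = t'$, then $\mathit{proj}_t(p) = \mathit{proj}_t(p') \cdot (e' = 0?)$. Recall property (iii): $\mathsf{E}[\![ e' ]\!]\rho \sqsubseteq 
\mathsf{E}_I[\![ e' ]\!](t', \rho', I_I)$, which implies that $0 \in V(\mathsf{E}_I[\![ e' ]\!](t', \rho', I_I))$, and so, we have $\rho' \in 
R(\Pi_I[\![ \mathit{proj}_t(p), t' ]\!](\mathcal{E}_0, \emptyset, I_I))$. 

• Assume that $s'$ is an assignment $X \leftarrow e'$. Take any $\rho \in R(\Pi_*[\![ p ]\!](\mathcal{E}_0, \emptyset))$ and $t \in \mathcal{T}$. Then 
$\rho$ has the form $\rho_0[X \mapsto \rho(X)]$ for some $\rho_0 \in R(\Pi_*[\![ p' ]\!](\mathcal{E}_0, \emptyset))$ and $\rho(X) \in V(\mathsf{E}[\![ e' ]\!]\rho_0)$. 

We first prove that $(t', X, \rho(X)) \in I_I$. Take $\rho'_0$ as defined by the recurrence hypothesis 
(ii) for $t'$, $p'$ and $\rho_0$. Then $\rho(X) \in V(\mathsf{E}[\![ e' ]\!]\rho_0) \subseteq V(\mathsf{E}_I[\![ e' ]\!](t', \rho'_0, I_I))$ by property (iii). By 
definition of $\mathsf{S}_I[\![ X \leftarrow e', t' ]\!]$ (Fig. 8), we have $(t', X, \rho(X)) \in I(\mathsf{S}_I[\![ X \leftarrow e', t' ]\!](\{ \rho'_0 \}, \emptyset, I_I))$. 
We have $\rho'_0 \in R(\Pi_I[\![ \mathit{proj}_{t'}(p'), t' ]\!](\mathcal{E}_0, \emptyset, I_I))$ by definition. Because $\mathit{proj}_{t'}(p) = \mathit{proj}_{t'}(p') \cdot 
(X \leftarrow e') \in \pi(\mathit{body}_{t'})$, it follows that $(t', X, \rho(X)) \in I(\Pi_I[\![ \pi(\mathit{body}_{t'}), t' ]\!](\mathcal{E}_0, \emptyset, I_I))$. 
By Thm. 3.1, we then have $(t', X, \rho(X)) \in I(\mathsf{S}_I[\![ \mathit{body}_{t'}, t' ]\!](\mathcal{E}_0, \emptyset, I_I)) = I'_{t'}$. Thus, 
$(t', X, \rho(X)) \in I_I$, as $I_I$ satisfies $I_I = \bigcup_{t \in \mathcal{T}} I'_t$. 

To prove (ii), we consider first the case where $t \neq t'$. Take $\rho'_0$ as defined by the 
recurrence hypothesis (ii) for $t$, $p'$ and $\rho_0$. As $t \neq t'$, $\mathit{proj}_t(p) = \mathit{proj}_t(p')$, and $\rho'_0 \in 
R(\Pi_I[\![ \mathit{proj}_t(p), t ]\!](\mathcal{E}_0, \emptyset, I_I))$. As $\rho$ and $\rho_0$ are equal except maybe on $X$, and we have 
$(t', X, \rho(X)) \in I_I$, then $\rho'_0$ satisfies (ii) for $t$, $p$, and $\rho$. 

We now consider the case where $t = t'$. Take $\rho'_0$ as defined by the recurrence
hypothesis (ii) for $t$, $p'$ and $\rho_0$. We define $\rho' = \rho'_0[X \mapsto \rho(X)]$. The property (iii)
implies $V(\mathsf{E}[\![ e' ]\!]\rho_0) \subseteq V(\mathsf{E}_I[\![ e' ]\!](t', \rho'_0, I_I))$. We get $\rho' \in R(\Pi_I[\![ \mathit{proj}_t(p') \cdot (X \leftarrow e'), t ]\!] 
(\mathcal{E}_0, \emptyset, I_I)) = R(\Pi_I[\![ \mathit{proj}_t(p), t ]\!](\mathcal{E}_0, \emptyset, I_I))$. As $\rho$ and $\rho_0$ are equal except maybe on 
$X$, and $\rho'$ and $\rho'_0$ are also equal except maybe on $X$, and on $X$ we have $\rho(X) = \rho'(X)$, 
then $\rho'$ satisfies (ii) for $t$, $p$ and $\rho$. 

The theorem then stems from applying property (i) to all $p \in \pi_*$ and using Thm. 3.1: 

\[
\begin{array}{rcl}
\mathsf{P}_* &=& \Omega(\Pi_*[\![ \pi_* ]\!](\mathcal{E}_0, \emptyset)) \\
&=& \bigcup_{p \in \pi_*} \Omega(\Pi_*[\![ p ]\!](\mathcal{E}_0, \emptyset)) \\
&\subseteq& \bigcup_{t \in \mathcal{T}, p \in \pi_*} \Omega(\Pi_I[\![ \mathit{proj}_t(p), t ]\!](\mathcal{E}_0, \emptyset, I_I)) \\
&=& \bigcup_{t \in \mathcal{T}} \Omega(\Pi_I[\![ \pi(\mathit{body}_t), t ]\!](\mathcal{E}_0, \emptyset, I_I)) \\
&=& \bigcup_{t \in \mathcal{T}} \Omega(\mathsf{S}_I[\![ \mathit{body}_t, t ]\!](\mathcal{E}_0, \emptyset, I_I)) \\
&\subseteq& \bigcup_{t \in \mathcal{T}} \Omega(\mathsf{S}_I[\![ \mathit{body}_t, t ]\!](\mathcal{E}_0, \Omega_I, I_I)) \\
&=& \Omega_I \\
&=& \mathsf{P}_I .
\end{array}
\]

□ 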
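
Property (ii) can be observed on a tiny two-thread program. The following Python sketch is a hedged illustration only: the program, the encoding of assignments as `(target, read_vars, fn)`, and all function names are assumptions made for the example, and errors and guards are left out. It enumerates all interleavings, computes a flow-insensitive interference set by iterating the per-thread analysis until stabilization, and checks that each interleaved environment is matched by each thread up to interferences from other threads.

```python
from itertools import product

INIT = {"x": 0, "y": 0}
# Hypothetical program: thread 0 runs x <- 1; x <- 2 and thread 1 runs y <- x + 1.
THREADS = {
    0: [("x", (), lambda: 1), ("x", (), lambda: 2)],
    1: [("y", ("x",), lambda x: x + 1)],
}

def interleavings(a, b):
    """All merges of sequences a and b that preserve each sequence's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def concrete_finals():
    """Final environments of the interleaving semantics."""
    tagged = {t: [(t, s) for s in body] for t, body in THREADS.items()}
    finals = set()
    for path in interleavings(tagged[0], tagged[1]):
        env = dict(INIT)
        for _, (x, reads, fn) in path:
            env[x] = fn(*(env[r] for r in reads))
        finals.add(frozenset(env.items()))
    return finals

def analyze_thread(t, I):
    """Analyze thread t alone: a read sees the thread's own value of the
    variable or any value written to it by another thread according to I."""
    envs, out = {frozenset(INIT.items())}, set(I)
    for x, reads, fn in THREADS[t]:
        nxt = set()
        for fe in envs:
            env = dict(fe)
            choices = [{env[r]} | {v for (u, y, v) in I if u != t and y == r}
                       for r in reads]
            for vals in product(*choices):
                v = fn(*vals)
                nxt.add(frozenset({**env, x: v}.items()))
                out.add((t, x, v))          # record the interference (t, x, v)
        envs = nxt
    return envs, out

def interference_fixpoint():
    """Re-analyze every thread until the interference set is stable."""
    I = set()
    while True:
        new = set(I)
        for t in THREADS:
            new |= analyze_thread(t, I)[1]
        if new == I:
            return I, {t: analyze_thread(t, I)[0] for t in THREADS}
        I = new

def covered(rho, t, envs_t, I):
    """Property (ii): some environment of thread t agrees with rho on every
    variable, up to values written by other threads."""
    rho = dict(rho)
    return any(
        all(rho[X] == dict(r)[X]
            or any(u != t and y == X and v == rho[X] for (u, y, v) in I)
            for X in rho)
        for r in envs_t)

I, envs = interference_fixpoint()
```

In this example thread 1 never sees `x = 2` in its own environments, but the value is recorded as an interference from thread 0, which is enough for the coverage check to succeed.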



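A.6. Proof of Theorem 3.3, $\mathsf{P}_I \subseteq \mathsf{P}^\sharp_I$. 

Proof. We start by considering the semantics of expressions. We first note that, for any 
abstract environment $R^\sharp \in \mathcal{E}^\sharp$, abstract interference $I^\sharp \in \mathcal{I}^\sharp$, thread $t$, and expression $e$, 
if $\rho \in \gamma_{\mathcal{E}}(R^\sharp)$, then $\mathsf{E}_I[\![ e ]\!](t, \rho, \gamma_I(I^\sharp)) \sqsubseteq \mathsf{E}[\![ \mathit{apply}(t, R^\sharp, I^\sharp, e) ]\!]\rho$, i.e., $\mathit{apply}$ can be used to 
over-approximate the semantics with interference $\mathsf{E}_I$ using the non-interfering semantics $\mathsf{E}$ 
in the concrete. We prove this by structural induction on the syntax of expressions $e$. We 
consider first the case of variables $e = X \in \mathcal{V}$, which is the interesting case. When $\gamma_I(I^\sharp)$ 
has no interference for $X$, then $\mathit{apply}(t, R^\sharp, I^\sharp, X) = X$ and: 

\[ \mathsf{E}[\![ \mathit{apply}(t, R^\sharp, I^\sharp, X) ]\!]\rho = \mathsf{E}[\![ X ]\!]\rho = \{ \rho(X) \} = \mathsf{E}_I[\![ X ]\!](t, \rho, \gamma_I(I^\sharp)) . \]

In case of interferences on $X$, we have $\mathit{apply}(t, R^\sharp, I^\sharp, X) = \mathit{as\text{-}expr}(V^\sharp_X \cup^\sharp \mathit{get}(X, R^\sharp))$, 
which is an interval $[l, h]$ containing, by definition of $\mathit{as\text{-}expr}$ and soundness of $\mathit{get}$, both $\rho(X)$ 
and $\{ v \mid \exists t' \neq t : (t', X, v) \in \gamma_I(I^\sharp) \}$. Thus, $\mathsf{E}[\![ \mathit{apply}(t, R^\sharp, I^\sharp, X) ]\!]\rho = \mathsf{E}[\![ [l, h] ]\!]\rho \sqsupseteq \mathsf{E}_I[\![ X ]\!](t, \rho, \gamma_I(I^\sharp))$. For 
other expressions, we note that $\mathsf{E}_I$ and $\mathsf{E}$ are defined in a similar way, and so, the proof stems 
from the induction hypothesis and the monotony of $\mathsf{E}_I$ and $\mathsf{E}$. 

To prove the soundness of primitive statements $\mathsf{S}^\sharp_I[\![ s, t ]\!]$, we combine the above result with 
the soundness of $\mathsf{S}^\sharp$ with respect to $\mathsf{S}$. We immediately get that $R((\mathsf{S}_I[\![ s, t ]\!] \circ \gamma)(R^\sharp, \Omega, I^\sharp)) \subseteq 
R((\gamma \circ \mathsf{S}^\sharp_I[\![ s, t ]\!])(R^\sharp, \Omega, I^\sharp))$ for an assignment or test $s$, and likewise for the $\Omega$ component. (We 
reuse the notations $R(X)$, $\Omega(X)$, $I(X)$ from the proof A.5.) The $I$ component is unchanged 
by $\mathsf{S}_I[\![ s, t ]\!]$ and $\mathsf{S}^\sharp_I[\![ s, t ]\!]$ when $s$ is a test. For the $I$ component after an assignment, we remark 
that $I((\mathsf{S}_I[\![ X \leftarrow e, t ]\!] \circ \gamma)(R^\sharp, \Omega, I^\sharp))$ can be written as 

\[ \gamma_I(I^\sharp) \cup \{ (t, X, \rho(X)) \mid \rho \in R((\mathsf{S}_I[\![ X \leftarrow e, t ]\!] \circ \gamma)(R^\sharp, \Omega, I^\sharp)) \} . \]

We then reuse the fact that $R((\mathsf{S}_I[\![ s, t ]\!] \circ \gamma)(R^\sharp, \Omega, I^\sharp)) \subseteq R((\gamma \circ \mathsf{S}^\sharp_I[\![ s, t ]\!])(R^\sharp, \Omega, I^\sharp))$ to conclude 
the proof of primitive statements. 

The case of non-primitive statements is easier as it is mostly unchanged between $\mathsf{S}$ and 
$\mathsf{S}_I$ (hence, between $\mathsf{S}^\sharp$ and $\mathsf{S}^\sharp_I$). Hence, the proof in A.2 of Thm. 2.2 still applies. 
As a consequence, we have: 

\[ \forall t \in \mathcal{T} : (\mathsf{S}_I[\![ \mathit{body}_t, t ]\!] \circ \gamma)(R^\sharp, \Omega, I^\sharp) \sqsubseteq_I (\gamma \circ \mathsf{S}^\sharp_I[\![ \mathit{body}_t, t ]\!])(R^\sharp, \Omega, I^\sharp) . \]

Consider now the solutions $(\Omega_I, I_I)$ and $(\Omega^\sharp_I, I^\sharp_I)$ of the fixpoints (3.4) and (3.6). We 
have: 

\[
\begin{array}{l}
(\Omega_I, I_I) = \mathrm{lfp}\, F \\
\text{where } F(\Omega, I) = \bigsqcup_{t \in \mathcal{T}} (\Omega'_t, I'_t) \\
\text{and } (-, \Omega'_t, I'_t) = \mathsf{S}_I[\![ \mathit{body}_t, t ]\!](\mathcal{E}_0, \Omega, I)
\end{array}
\]

Likewise: 

\[
\begin{array}{l}
(\Omega^\sharp_I, I^\sharp_I) = \lim F^\sharp \\
\text{where } F^\sharp(\Omega^\sharp, I^\sharp) = (\Omega^\sharp, I^\sharp) \triangledown \bigsqcup^\sharp_{t \in \mathcal{T}} (\Omega^{\sharp\prime}_t, I^{\sharp\prime}_t) \\
\text{and } (-, \Omega^{\sharp\prime}_t, I^{\sharp\prime}_t) = \mathsf{S}^\sharp_I[\![ \mathit{body}_t, t ]\!](\mathcal{E}^\sharp_0, \Omega^\sharp, I^\sharp) \\
\text{defining } (\Omega^\sharp_1, I^\sharp_1) \sqcup^\sharp (\Omega^\sharp_2, I^\sharp_2) = (\Omega^\sharp_1 \cup \Omega^\sharp_2, I^\sharp_1 \cup^\sharp_I I^\sharp_2) \\
\text{and } (\Omega^\sharp_1, I^\sharp_1) \triangledown (\Omega^\sharp_2, I^\sharp_2) = (\Omega^\sharp_2, I^\sharp_1 \triangledown_I I^\sharp_2)
\end{array}
\]

By soundness of $\cup^\sharp_I$, $\triangledown_I$, $\mathsf{S}^\sharp_I$, and $\mathcal{E}^\sharp_0$, we get that $F^\sharp$ is a sound abstraction of $F$. The same 
fixpoint transfer property that was used for loops in proof A.2 can be used here to prove 
that $\lim F^\sharp$ is a sound abstraction of $\mathrm{lfp}\, F$. As a consequence, we have $\Omega^\sharp_I \supseteq \Omega_I$, and so, 
$\mathsf{P}^\sharp_I \supseteq \mathsf{P}_I$. □ 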
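
The role of $\mathit{apply}$ on a variable can be pictured with intervals. The sketch below is an assumption-laden illustration, not the prototype's code: abstract interferences are represented as a hypothetical dictionary from `(thread, variable)` to an interval, and the function returns the interval hull of the thread's own value and the values written by the other threads.

```python
def apply_var(t, own_interval, interferences, X):
    """Interval [l, h] covering thread t's own value of X and every value
    written to X by a thread other than t.

    `interferences` maps (thread, variable) to an interval (lo, hi); this
    non-relational, flow-insensitive representation is an assumption of the
    sketch.
    """
    lo, hi = own_interval
    for (u, Y), (ilo, ihi) in interferences.items():
        if u != t and Y == X:
            lo, hi = min(lo, ilo), max(hi, ihi)
    return (lo, hi)

example = apply_var(
    0, (1, 1),
    {(1, "X"): (5, 7), (0, "X"): (100, 100), (1, "Y"): (-3, -3)},
    "X",
)
```

Interferences produced by the thread itself, or on other variables, are ignored, so the result only widens when another thread may write to the variable being read.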



A.7. Proof of Theorem 3.5, $\mathsf{P}'_* \subseteq \mathsf{P}_I$. 

Proof. We reuse the notations $R(X)$, $\Omega(X)$, $I(X)$, $V(X)$ from proof A.5. Consider a path 
$p$ from a thread $t$ that gives, under an elementary transformation from Def. 3.4, a path $p'$. 
Let us denote by $\mathcal{V}_f$ the subset of fresh variables (i.e., that do not appear anywhere in the 
program, except maybe in $p'$), and by $\mathcal{V}_l(t)$ the subset of variables that are local to $t$ (i.e., 
that do not appear anywhere, except maybe in thread $t$). 
We consider triples $(R, \Omega, I)$ such that the interferences $I$ are consistent with the fresh 
and local variables, i.e., if $(t, X, v) \in I$, then $X \notin \mathcal{V}_f$ and $X \in \mathcal{V}_l(t') \Longrightarrow t = t'$. We prove 
that, for such triples, the following holds: 

(i) $\Omega(\Pi_I[\![ p', t ]\!](R, \Omega, I)) \subseteq \Omega(\Pi_I[\![ p, t ]\!](R, \Omega, I))$ 

(ii) $\forall (t', X, v) \in I(\Pi_I[\![ p', t ]\!](R, \Omega, I)) :$ 
$(t', X, v) \in I(\Pi_I[\![ p, t ]\!](R, \Omega, I))$ or $(t = t' \wedge X \in \mathcal{V}_l(t))$ 

(iii) $\forall \rho' \in R(\Pi_I[\![ p', t ]\!](R, \Omega, I)) : \exists \rho \in R(\Pi_I[\![ p, t ]\!](R, \Omega, I)) :$ 
$\forall X \in \mathcal{V} : \rho(X) = \rho'(X)$ or $X \in \mathcal{V}_f$ 

i.e., the transformation does not add any error (i), it can only add interferences on local 
variables (ii), and change environments on fresh variables (iii). We now prove (i)-(iii) for 
each case in Def. 3.4. 

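(1) Redundant store elimination: $X \leftarrow e_1 \cdot X \leftarrow e_2 \rightsquigarrow X \leftarrow e_2$, where $X \notin \mathit{var}(e_2)$ and 
$\mathit{nonblock}(e_1)$. 

We note: 

- $(R_i, \Omega_i, I_i) = \Pi_I[\![ X \leftarrow e_i, t ]\!](R, \Omega, I)$ for $i = 1, 2$, and 

- $(R_{1;2}, \Omega_{1;2}, I_{1;2}) = \Pi_I[\![ X \leftarrow e_1 \cdot X \leftarrow e_2, t ]\!](R, \Omega, I)$. 

As $X \notin \mathit{var}(e_2)$, $\mathsf{E}_I[\![ e_2 ]\!](t, \rho, I) = \mathsf{E}_I[\![ e_2 ]\!](t, \rho[X \mapsto v], I)$ for any $\rho$ and $v$. Moreover, as 
$\mathit{nonblock}(e_1)$: 

\[ \forall \rho \in R : \exists v \in V(\mathsf{E}_I[\![ e_1 ]\!](t, \rho, I)) : \rho[X \mapsto v] \in R_1 . \]

This implies $R_{1;2} = R_2$, and so, (iii). Moreover, $\Omega_{1;2} = \Omega_1 \cup \Omega_2 \supseteq \Omega_2$, and so, (i). 
Likewise, $I_{1;2} = I_1 \cup I_2 \supseteq I_2$, and so, (ii). 

Note that the hypothesis $\mathit{nonblock}(e_1)$ is important. Otherwise we could allow $X \leftarrow 
1 /_{\ell} 0 \cdot X \leftarrow 1 /_{\ell'} 0 \rightsquigarrow X \leftarrow 1 /_{\ell'} 0$, where the error $\ell'$ is in the transformed program but 
not in the original one (here, $\Omega_{1;2} = \Omega_1 \not\supseteq \Omega_2$). 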

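(2) Identity store elimination: $X \leftarrow X \rightsquigarrow \epsilon$. 
We have: 

\[
\begin{array}{rcl}
\Omega(\Pi_I[\![ X \leftarrow X, t ]\!](R, \Omega, I)) &=& \Omega \\
I(\Pi_I[\![ X \leftarrow X, t ]\!](R, \Omega, I)) &\supseteq& I \\
R(\Pi_I[\![ X \leftarrow X, t ]\!](R, \Omega, I)) &\supseteq& R
\end{array}
\]

In the two last inequalities, the converse inequality does not necessarily hold. Indeed,
$X \leftarrow X$ creates interferences that may be observed by other threads. Moreover 
$V(\mathsf{E}_I[\![ X ]\!](t, \rho, I))$ is not necessarily $\{ \rho(X) \}$, but may contain interferences from other 
threads. This shows in particular that the converse transformation, identity insertion, 
is not acceptable as it introduces interferences. 

(3) Reordering assignments: $X_1 \leftarrow e_1 \cdot X_2 \leftarrow e_2 \rightsquigarrow X_2 \leftarrow e_2 \cdot X_1 \leftarrow e_1$, where $X_1 \notin \mathit{var}(e_2)$, 
$X_2 \notin \mathit{var}(e_1)$, $X_1 \neq X_2$, and $\mathit{nonblock}(e_1)$. 

We note: 

- $(R_i, \Omega_i, I_i) = \Pi_I[\![ X_i \leftarrow e_i, t ]\!](R, \Omega, I)$ for $i = 1, 2$, and 

- $(R_{1;2}, \Omega_{1;2}, I_{1;2}) = \Pi_I[\![ X_1 \leftarrow e_1 \cdot X_2 \leftarrow e_2, t ]\!](R, \Omega, I)$, and 

- $(R_{2;1}, \Omega_{2;1}, I_{2;1}) = \Pi_I[\![ X_2 \leftarrow e_2 \cdot X_1 \leftarrow e_1, t ]\!](R, \Omega, I)$. 

As $X_2 \notin \mathit{var}(e_1)$, $X_1 \notin \mathit{var}(e_2)$, and $X_1 \neq X_2$, we have 

\[
\begin{array}{l}
\forall \rho, v : \mathsf{E}_I[\![ e_1 ]\!](t, \rho, I) = \mathsf{E}_I[\![ e_1 ]\!](t, \rho[X_2 \mapsto v], I) \text{ and} \\
\forall \rho, v : \mathsf{E}_I[\![ e_2 ]\!](t, \rho, I) = \mathsf{E}_I[\![ e_2 ]\!](t, \rho[X_1 \mapsto v], I) .
\end{array}
\]

This implies $R_{1;2} = R_{2;1}$, and so, (iii). 

As $\mathit{nonblock}(e_1)$, we have $\forall \rho \in R : \exists v : \rho[X_1 \mapsto v] \in R_1$. This implies $\Omega_2 = 
\Omega(\Pi_I[\![ X_2 \leftarrow e_2, t ]\!](R_1, \Omega, I))$ and so $\Omega_{1;2} = \Omega_1 \cup \Omega_2$. Likewise, $I_{1;2} = I_1 \cup I_2$. Moreover, 
$\Omega_{2;1} \subseteq \Omega_1 \cup \Omega_2 = \Omega_{1;2}$ and $I_{2;1} \subseteq I_1 \cup I_2 = I_{1;2}$. Thus (i) and (ii) hold. 

As in (1), the $\mathit{nonblock}(e_1)$ hypothesis is important so that errors in $e_2$ masked by 
$X_1 \leftarrow e_1$ in the original program do not appear in the transformed program. 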

(4) Reordering guards: $e_1 \bowtie 0? \cdot e_2 \bowtie' 0? \rightsquigarrow e_2 \bowtie' 0? \cdot e_1 \bowtie 0?$, where $\mathit{noerror}(e_2)$. 

We use the same notations as in the preceding proof. We have $I_{2;1} = I_{1;2} = I$, which 
proves (ii). Consider $\rho \in R$, then either $\rho \in R_1 \cap R_2$, in which case $\rho \in R_{1;2}$ and $\rho \in R_{2;1}$, 
or $\rho \notin R_1 \cap R_2$, in which case $\rho \notin R_{1;2}$ and $\rho \notin R_{2;1}$. In both cases, $R_{1;2} = R_{2;1}$, which 
proves (iii). We have $\Omega_2 \subseteq \Omega_{2;1} \subseteq \Omega_1 \cup \Omega_2$ and $\Omega_1 \subseteq \Omega_{1;2} \subseteq \Omega_1 \cup \Omega_2$. But, as $\mathit{noerror}(e_2)$, 
$\Omega_2 = \Omega$, which implies $\Omega_{2;1} \subseteq \Omega_1 \subseteq \Omega_{1;2}$, and so, (i). 

The property $\mathit{noerror}(e_2)$ is important. Consider otherwise the case where $e_1 \bowtie 0?$ 
filters out an environment leading to an error in $e_2$. The error would appear in the 
transformed program but not in the original one. 

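(5) Reordering guards before assignments: $X_1 \leftarrow e_1 \cdot e_2 \bowtie 0? \rightsquigarrow e_2 \bowtie 0? \cdot X_1 \leftarrow e_1$, where 
$X_1 \notin \mathit{var}(e_2)$ and either $\mathit{nonblock}(e_1)$ or $\mathit{noerror}(e_2)$. 

We use the same notations as in the preceding proofs. As $X_1 \notin \mathit{var}(e_2)$, we have 

\[ \forall \rho, v : \mathsf{E}_I[\![ e_2 ]\!](t, \rho, I) = \mathsf{E}_I[\![ e_2 ]\!](t, \rho[X_1 \mapsto v], I) . \]

This implies $R_{1;2} = R_{2;1}$, and so, (iii). 

Moreover, we have $I_{1;2} = I_1$ and $I_{2;1} \subseteq I_1$, thus (ii) holds. 

For (i), consider first the case $\mathit{nonblock}(e_1)$. We have $\forall \rho \in R : \exists v : \rho[X_1 \mapsto v] \in R_1$. 
This implies $\Omega_2 = \Omega(\Pi_I[\![ e_2 \bowtie 0?, t ]\!](R_1, \Omega, I))$, and so $\Omega_{1;2} = \Omega_1 \cup \Omega_2$. As $\Omega_2 \subseteq 
\Omega_{2;1} \subseteq \Omega_1 \cup \Omega_2$, (i) holds. Consider now the case $\mathit{noerror}(e_2)$. We have $\Omega_{1;2} = \Omega_1$ and 
$\Omega_{2;1} \subseteq \Omega_1$, and so, (i) holds. 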

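(6) Reordering assignments before guards: $e_1 \bowtie 0? \cdot X_2 \leftarrow e_2 \rightsquigarrow X_2 \leftarrow e_2 \cdot e_1 \bowtie 0?$, where 
$X_2 \notin \mathit{var}(e_1)$, $X_2 \in \mathcal{V}_l(t)$, and $\mathit{noerror}(e_2)$. 

As $X_2 \notin \mathit{var}(e_1)$, we have 

\[ \forall \rho, v : \mathsf{E}_I[\![ e_1 ]\!](t, \rho, I) = \mathsf{E}_I[\![ e_1 ]\!](t, \rho[X_2 \mapsto v], I) \]

and thus, using the same notations as before, $R_{1;2} = R_{2;1}$, and so, (iii). 

Because $\mathit{noerror}(e_2)$, we have $\Omega_{2;1} \subseteq \Omega_1 \cup \Omega_2 = \Omega_1$ and $\Omega_{1;2} = \Omega_1$, and so $\Omega_{2;1} \subseteq \Omega_{1;2}$, 
which proves (i). 

Unlike the preceding proofs, we now have $I_{1;2} \subseteq I_{2;1} = I_2$ and generally $I_{1;2} \not\supseteq I_{2;1}$. 
However, as $X_2 \in \mathcal{V}_l(t)$, we have $I_{2;1} \setminus I_{1;2} \subseteq \{ t \} \times \mathcal{V}_l(t) \times \mathbb{R}$, i.e., the only interferences 
added by the transformation concern the variable $X_2$, local to $t$. This is sufficient to 
ensure (ii). 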

(7) Assignment propagation: $X \leftarrow e \cdot s \rightsquigarrow X \leftarrow e \cdot s[e/X]$, where $X \notin \mathit{var}(e)$, $\mathit{var}(e) \subseteq 
\mathcal{V}_l(t)$, and $e$ is deterministic. 

Let us note: 

- $(R_1, \Omega_1, I_1) = \Pi_I[\![ X \leftarrow e, t ]\!](R, \Omega, I)$, 

- $(R_{1;2}, \Omega_{1;2}, I_{1;2}) = \Pi_I[\![ s, t ]\!](R_1, \Omega_1, I_1)$, 

- $(R'_{1;2}, \Omega'_{1;2}, I'_{1;2}) = \Pi_I[\![ s[e/X], t ]\!](R_1, \Omega_1, I_1)$. 

Take $\rho_1 \in R_1$, then there exists $\rho \in R$ such that $\rho_1 = \rho[X \mapsto \rho_1(X)]$ and $\rho_1(X) \in 
V(\mathsf{E}_I[\![ e ]\!](t, \rho, I))$. As $e$ is deterministic and $X \notin \mathit{var}(e)$, $V(\mathsf{E}[\![ X ]\!]\rho_1) = V(\mathsf{E}[\![ e ]\!]\rho) = 
\{ \rho_1(X) \}$. Additionally, as $\mathit{var}(e) \subseteq \mathcal{V}_l(t)$, there is no interference in $I$ for variables in $e$ 
from other threads, and so, $V(\mathsf{E}_I[\![ X ]\!](t, \rho_1, I_1)) = V(\mathsf{E}_I[\![ e ]\!](t, \rho, I)) = \{ \rho_1(X) \}$. As a 
consequence, $R_{1;2} = R'_{1;2}$, which proves (iii). Moreover, $I_{1;2} = I'_{1;2}$, hence (ii) holds.
Finally, we have $\bigcup \{ \Omega(\mathsf{E}_I[\![ e ]\!](t, \rho, I)) \mid \rho \in R \} \subseteq \Omega_1$ but also $\bigcup \{ \Omega(\mathsf{E}_I[\![ e ]\!](t, \rho_1, I_1)) \mid \rho_1 \in 
R_1 \} \subseteq \Omega_1$, i.e., any error from the evaluation of $e$ while executing $s[e/X]$ in $R_1$ was
already present during the evaluation of $e$ while executing $X \leftarrow e$ in $R$. Thus, $\Omega_{1;2} = \Omega'_{1;2}$, 
and (i) holds. 

The fact that $e$ is deterministic is important. Otherwise, consider $X \leftarrow [0,1] \cdot Y \leftarrow 
X + X \rightsquigarrow X \leftarrow [0,1] \cdot Y \leftarrow [0,1] + [0,1]$. Then $Y \in \{ 0, 2 \}$ on the left of $\rightsquigarrow$ while 
$Y \in \{ 0, 1, 2 \}$ on the right of $\rightsquigarrow$: the transformed program has more behaviors than 
the original one. Likewise, $\mathit{var}(e) \subseteq \mathcal{V}_l(t)$ ensures that $e$ may not evaluate to a different 
value due to interferences from other threads. 
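
The counter-example for non-deterministic expressions can be reproduced by enumeration. This Python sketch assumes, for illustration only, that the expression $[0,1]$ is read as a choice between the integers 0 and 1; the function names are hypothetical.

```python
from itertools import product

CHOICE = (0, 1)  # integer reading of the expression [0,1] (assumption)

def original():
    """X <- [0,1]; Y <- X + X : X is chosen once, then read twice."""
    return {x + x for x in CHOICE}

def propagated():
    """X <- [0,1]; Y <- [0,1] + [0,1] : each occurrence is chosen independently."""
    return {a + b for a, b in product(CHOICE, CHOICE)}
```

The propagated program can produce `Y = 1`, which the original program cannot, so the transformation is not acceptable without the determinism hypothesis.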

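(8) Sub-expression elimination: $s_1 \cdot \ldots \cdot s_n \rightsquigarrow X \leftarrow e \cdot s_1[X/e] \cdot \ldots \cdot s_n[X/e]$, where $X \in \mathcal{V}_f$, 
$\mathit{var}(e) \cap \mathit{lval}(s_i) = \emptyset$, and $\mathit{noerror}(e)$. 

Let us note: 

\[ (R', \Omega', I') = \Pi_I[\![ X \leftarrow e, t ]\!](R, \Omega, I) . \]

Consider $\rho' \in R'$. Then $\rho' = \rho[X \mapsto \rho'(X)]$ for some $\rho \in R$ and $\rho'(X) \in V(\mathsf{E}_I[\![ e ]\!](t, \rho, I))$. 
As $X$ does not appear in $s_i$ (being fresh), and noting: 

\[ (R'_i, \Omega'_i, I'_i) = \Pi_I[\![ s_1[X/e] \cdot \ldots \cdot s_i[X/e], t ]\!](\{ \rho' \}, \Omega', I') \]

we get: 

\[ \forall i, \rho'_i \in R'_i : V(\mathsf{E}[\![ X ]\!]\rho'_i) = \{ \rho'(X) \} \]

and, as $X \in \mathcal{V}_f$, there is no interference on $X$, and 

\[ \forall i, \rho'_i \in R'_i : V(\mathsf{E}_I[\![ X ]\!](t, \rho'_i, I'_i)) = \{ \rho'(X) \} . \]

As $\mathit{var}(e) \cap \mathit{lval}(s_i) = \emptyset$, and noting 

\[ (R_i, \Omega_i, I_i) = \Pi_I[\![ s_1 \cdot \ldots \cdot s_i, t ]\!](\{ \rho \}, \Omega, I) \]

we get: 

\[ \forall i, \rho_i \in R_i : V(\mathsf{E}_I[\![ e ]\!](t, \rho_i, I_i)) = V(\mathsf{E}_I[\![ e ]\!](t, \rho, I)) \supseteq \{ \rho'(X) \} . \]

As a consequence, $\forall i, \rho'_i \in R'_i : \exists \rho_i \in R_i$ such that $\rho'_i = \rho_i[X \mapsto \rho'(X)]$. When $i = n$, 
this implies (iii). Another consequence is that $\forall i : I'_i \subseteq I_i \cup \{ (t, X, \rho'(X)) \mid \rho' \in R' \}$. 
As $X \in \mathcal{V}_f$, this implies (ii) when $i = n$. Moreover, as $\mathit{noerror}(e)$, $\Omega' = \Omega$. Note that: 

\[ \forall i : \Omega(\Pi_I[\![ s_i[X/e], t ]\!](R'_i, \Omega'_i, I'_i)) \subseteq \Omega(\Pi_I[\![ s_i, t ]\!](R'_i, \Omega'_i, I'_i)) . \]

Thus, $\forall i : \Omega'_i \subseteq \Omega_i$, which implies (i) when $i = n$. 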

(9) Expression simplification: $s \rightsquigarrow s[e'/e]$, when $\forall \rho : \mathsf{E}[\![ e ]\!]\rho \sqsupseteq \mathsf{E}[\![ e' ]\!]\rho$ and $\mathit{var}(e) \cup 
\mathit{var}(e') \subseteq \mathcal{V}_l(t)$. 

As $\mathit{var}(e) \subseteq \mathcal{V}_l(t)$, there is no interference in $I$ for variables in $e$ from other threads. 
We deduce that $\mathsf{E}_I[\![ e ]\!](t, \rho, I) = \mathsf{E}[\![ e ]\!]\rho$. Likewise, $\mathsf{E}_I[\![ e' ]\!](t, \rho, I) = \mathsf{E}[\![ e' ]\!]\rho$, and so, 
$\mathsf{E}_I[\![ e ]\!](t, \rho, I) \sqsupseteq \mathsf{E}_I[\![ e' ]\!](t, \rho, I)$. By monotony of $\mathsf{S}_I$, we get: 

\[ \Pi_I[\![ s, t ]\!](R, \Omega, I) \sqsupseteq_I \Pi_I[\![ s[e'/e], t ]\!](R, \Omega, I) \]

which implies (i)-(iii). 

We now prove that, if the pair $p, p'$ satisfies (i)-(iii), then so does the pair $p \cdot s, p' \cdot s$ for any 
primitive statement $s$ executed by $t$. We consider a triple $(R_0, \Omega_0, I_0)$ and note $(R', \Omega', I') = 
\Pi_I[\![ p', t ]\!](R_0, \Omega_0, I_0)$ and $(R, \Omega, I) = \Pi_I[\![ p, t ]\!](R_0, \Omega_0, I_0)$. Take any $\rho' \in R'$. By (iii) on the 
pair $p, p'$, there is some $\rho \in R$ that equals $\rho'$ except maybe for some variables that are fresh, 
and so, cannot occur in $s$. Moreover, by (ii) on $p, p'$, $I$ contains $I'$ except maybe for some 
interferences that are from $t$, and so, cannot influence the expression in $s$. So, by noting $e$ 
the expression appearing in $s$, $\mathsf{E}_I[\![ e ]\!](t, \rho', I') \sqsubseteq \mathsf{E}_I[\![ e ]\!](t, \rho, I)$. As a consequence, (i) and 
(ii) hold for the pair $p \cdot s, p' \cdot s$. Consider now $\rho' \in R(\Pi_I[\![ p' \cdot s, t ]\!](R_0, \Omega_0, I_0))$. If $s$ is a guard 
$e = 0?$ (the other guards $e \bowtie 0?$ being similar), then $\rho' \in R'$ and, by (iii) for $p, p'$, $\exists \rho \in R$ 
equal to $\rho'$ except for some fresh variables. As $0 \in V(\mathsf{E}_I[\![ e ]\!](t, \rho', I')) \subseteq V(\mathsf{E}_I[\![ e ]\!](t, \rho, I))$, $\rho$ 
proves the property (iii) for $p \cdot s, p' \cdot s$. If $s$ is an assignment $X \leftarrow e$ then $\rho' = \rho'_0[X \mapsto \rho'(X)]$ 
for some $\rho'_0 \in R'$ and $\rho'(X) \in V(\mathsf{E}_I[\![ e ]\!](t, \rho'_0, I'))$. By (iii) for $p, p'$, $\exists \rho_0 \in R$ equal to $\rho'_0$ 
except for some fresh variables. So, $\rho'(X) \in V(\mathsf{E}_I[\![ e ]\!](t, \rho'_0, I')) \subseteq V(\mathsf{E}_I[\![ e ]\!](t, \rho_0, I))$. Thus, 
$\rho_0[X \mapsto \rho'(X)]$ proves the property (iii) for $p \cdot s, p' \cdot s$. 

The following two properties are much easier to prove. Firstly, if the pair $p, p'$ satisfies 
(i)-(iii), so does the pair $q \cdot p, q \cdot p'$ for any path $q$. This holds because (i)-(iii) are stated 
for all $R$, $\Omega$ and $I$. Secondly, if both pairs $p, p'$ and $p', p''$ satisfy (i)-(iii), then so does the 
pair $p, p''$. This completes the proof that elementary path transformations can be applied 
in a context with an arbitrary prefix and suffix, and that several transformations can be 
applied sequentially. 

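We are now ready to prove the theorem. Consider the least fixpoint computed by the 
interference semantics (3.4): 

\[
\begin{array}{l}
(\Omega_I, I_I) = \mathrm{lfp}\, F \\
\text{where } F(\Omega, I) = (\bigcup_{t \in \mathcal{T}} \Omega_t, \bigcup_{t \in \mathcal{T}} I_t) \\
\text{and } (-, \Omega_t, I_t) = \mathsf{S}_I[\![ \mathit{body}_t, t ]\!](\mathcal{E}_0, \Omega, I)
\end{array}
\]

By Thm. 3.1 we have $(-, \Omega_t, I_t) = \Pi_I[\![ \pi(\mathit{body}_t), t ]\!](\mathcal{E}_0, \Omega, I)$. Given transformed threads 
$\pi'(t)$, consider also the fixpoint: 

\[
\begin{array}{l}
(\Omega'_I, I'_I) = \mathrm{lfp}\, F' \\
\text{where } F'(\Omega', I') = (\bigcup_{t \in \mathcal{T}} \Omega'_t, \bigcup_{t \in \mathcal{T}} I'_t \setminus I_l) \\
\text{and } (-, \Omega'_t, I'_t) = \Pi_I[\![ \pi'(t), t ]\!](\mathcal{E}_0, \Omega', I')
\end{array}
\]

and $I_l$ is a set of interferences we can ignore as they only affect local or fresh variables: 

\[ I_l = \{ (t, X, v) \mid t \in \mathcal{T}, v \in \mathbb{R}, X \in \mathcal{V}_f \cup \mathcal{V}_l(t) \} . \]

Then, given each path $p' \in \pi'(t)$, we can apply properties (i) and (ii) to the pair $p, p'$, 
where $p$ is the path in $\pi(\mathit{body}_t)$ that gives $p'$ after transformation. We get that $F'(X) \sqsubseteq_I 
F(X)$. As a consequence, $\mathrm{lfp}\, F' \sqsubseteq_I \mathrm{lfp}\, F$. The transformed semantics, however, is not 
exactly $\mathrm{lfp}\, F'$, but rather $(\Omega''_I, I''_I) = \mathrm{lfp}\, F''$: 

\[
\begin{array}{l}
\text{where } F''(\Omega', I') = (\bigcup_{t \in \mathcal{T}} \Omega'_t, \bigcup_{t \in \mathcal{T}} I'_t) \\
\text{and } (\Omega'_t, I'_t) \text{ defined as before}
\end{array}
\]

The difference lies in the extra interferences generated by the transformed program, which 
are all in $I_l$. Such interferences, however, have no influence on the semantics of threads. 
Indeed, any interference $(t', X, v) \in I_l$ is either from thread $t$ and then ignored for the 
thread $t$ itself, or it is from another thread $t' \neq t$ in which case $X$ is a local variable of $t'$ or 
a fresh variable, which does not occur in $t$. As a consequence, the interferences computed by 
both fixpoints coincide up to elements of $I_l$, the errors of each thread are the same, and 
so, $\Omega''_I = \Omega'_I \subseteq \Omega_I$. Hence, the interference semantics of the original program contains all the 
errors that can occur in any program obtained by acceptable thread transformations. □ 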



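A.8. Proof of Theorem 4.1, $\mathbb{P}_{\mathcal{H}} \subseteq \mathbb{P}_{\mathcal{C}}$ and $\mathbb{P}'_{\mathcal{H}} \subseteq \mathbb{P}_{\mathcal{C}}$.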



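Proof. We first prove $\mathbb{P}_{\mathcal{H}} \subseteq \mathbb{P}_{\mathcal{C}}$.

In order to do so, we need to consider a path-based interference semantics $\Pi_{\mathcal{C}}\llbracket P, t \rrbracket$
that, given a set of paths $P \subseteq \pi(\mathit{body}_t)$ in a thread $t$, computes:

$$\Pi_{\mathcal{C}}\llbracket P, t \rrbracket(R, \Omega, I) \stackrel{\text{def}}{=} \bigsqcup\nolimits_{\mathcal{C}} \{\, (\mathbb{S}_{\mathcal{C}}\llbracket s_n, t \rrbracket \circ \cdots \circ \mathbb{S}_{\mathcal{C}}\llbracket s_1, t \rrbracket)(R, \Omega, I) \mid s_1 \cdot \ldots \cdot s_n \in P \,\}.$$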



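Similarly to Thms. 2.3 and 3.1, the two semantics are equal:

$$\forall t \in \mathcal{T},\ s \in \mathit{Stat} : \Pi_{\mathcal{C}}\llbracket \pi(s), t \rrbracket = \mathbb{S}_{\mathcal{C}}\llbracket s, t \rrbracket$$

The proof is identical to that in A.3 as the $\mathbb{S}_{\mathcal{C}}$ functions are complete $\sqcup$-morphisms, and
so, we do not repeat it here.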
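The equality between a path-based and a structural semantics can be checked on a small example. The sketch below is a deliberately simplified illustration, not the semantics $\mathbb{S}_{\mathcal{C}}$ of the paper: it works on plain sets of environments, with no interferences, configurations, guards or loops, and the statement encoding (`"assign"`, `"seq"`, `"choice"`) is hypothetical. It relies on the same argument as above: each primitive acts pointwise on sets, hence distributes over unions.

```python
# Toy statement syntax (illustrative only):
#   ("assign", x, f)    x <- f(env)
#   ("seq", s1, s2)     s1; s2
#   ("choice", s1, s2)  non-deterministic branch


def S(stmt, envs):
    """Structural semantics on a set of environments (frozensets of (var, value) pairs)."""
    kind = stmt[0]
    if kind == "assign":
        _, x, f = stmt
        return {frozenset({**dict(e), x: f(dict(e))}.items()) for e in envs}
    if kind == "seq":
        return S(stmt[2], S(stmt[1], envs))
    if kind == "choice":
        return S(stmt[1], envs) | S(stmt[2], envs)
    raise ValueError(kind)


def paths(stmt):
    """pi(stmt): the control paths of stmt, each a tuple of primitive statements."""
    kind = stmt[0]
    if kind == "assign":
        return [(stmt,)]
    if kind == "seq":
        return [p + q for p in paths(stmt[1]) for q in paths(stmt[2])]
    if kind == "choice":
        return paths(stmt[1]) + paths(stmt[2])
    raise ValueError(kind)


def Pi(path_list, envs):
    """Path-based semantics: join, over all paths, of the composed primitives."""
    result = set()
    for path in path_list:
        current = envs
        for primitive in path:
            current = S(primitive, current)
        result |= current
    return result


# (X <- 1 or X <- 2); Y <- X + 1
prog = ("seq",
        ("choice", ("assign", "X", lambda e: 1), ("assign", "X", lambda e: 2)),
        ("assign", "Y", lambda e: e["X"] + 1))
init = {frozenset({"X": 0, "Y": 0}.items())}
structural = S(prog, init)
path_based = Pi(paths(prog), init)
```

On this program both semantics yield the same two environments, one per control path.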



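The rest of the proof that $\mathbb{P}_{\mathcal{H}} \subseteq \mathbb{P}_{\mathcal{C}}$ follows a similar structure as A.5. Let $(\Omega_{\mathcal{C}}, I_{\mathcal{C}})$ be
the fixpoint computed in (4.10), i.e.,

$$(\Omega_{\mathcal{C}}, I_{\mathcal{C}}) = \bigsqcup\nolimits_{\mathcal{C}} \{\, (\Omega'_t, I'_t) \mid t \in \mathcal{T} \,\}$$

where $(-, \Omega'_t, I'_t) = \mathbb{S}_{\mathcal{C}}\llbracket \mathit{body}_t, t \rrbracket(\{c_0\} \times \mathcal{E}_0, \Omega_{\mathcal{C}}, I_{\mathcal{C}})$.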

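We denote initial states for $\mathbb{P}_{\mathcal{H}}$ and $\mathbb{P}_{\mathcal{C}}$ as respectively $\mathcal{E}_{0h} = \{h_0\} \times \mathcal{E}_0$ and $\mathcal{E}_{0c} = \{c_0\} \times \mathcal{E}_0$.
Furthermore, we link scheduler states $h \in \mathcal{H}$ and partitioning configurations $c \in \mathcal{C}$ in a
thread $t$ with the following abstraction $\alpha_t : \mathcal{H} \to \mathcal{P}(\mathcal{C})$:

$$\alpha_t(b, l) = \{\, (l(t), u, \mathit{weak}) \mid \forall x \in u,\ t' \in \mathcal{T} : x \notin l(t') \,\}$$

i.e., a configuration forgets the ready state $b(t)$ of the thread (ready, yielding, or waiting
for some mutex), but remembers the exact set of mutexes $l(t)$ held by the current thread,
and optionally remembers the mutexes $u$ not held by any thread. We prove the following
properties by induction on the length of the path $p \in \pi_*$:

(i) $\Omega(\Pi_{\mathcal{H}}\llbracket p \rrbracket(\mathcal{E}_{0h}, \emptyset)) \subseteq \bigcup_{t \in \mathcal{T}} \Omega(\Pi_{\mathcal{C}}\llbracket \mathit{proj}_t(p), t \rrbracket(\mathcal{E}_{0c}, \emptyset, I_{\mathcal{C}}))$

(ii) $\forall t \in \mathcal{T} : \forall (h, \rho) \in R(\Pi_{\mathcal{H}}\llbracket p \rrbracket(\mathcal{E}_{0h}, \emptyset)) : \exists (c, \rho') \in R(\Pi_{\mathcal{C}}\llbracket \mathit{proj}_t(p), t \rrbracket(\mathcal{E}_{0c}, \emptyset, I_{\mathcal{C}})) :$

$c \in \alpha_t(h) \wedge {}$

$\forall X \in \mathcal{V} : \rho(X) = \rho'(X)$ or $\exists t', c' : t \neq t',\ \mathit{intf}(c, c'),\ (t', c', X, \rho(X)) \in I_{\mathcal{C}}$.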

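The properties hold for $p = \epsilon$ as $\forall t \in \mathcal{T}$:

$$\Pi_{\mathcal{H}}\llbracket \epsilon \rrbracket(\mathcal{E}_{0h}, \emptyset) = (\mathcal{E}_{0h}, \emptyset)$$

$$\Pi_{\mathcal{C}}\llbracket \epsilon, t \rrbracket(\mathcal{E}_{0c}, \emptyset, I_{\mathcal{C}}) = (\mathcal{E}_{0c}, \emptyset, I_{\mathcal{C}})$$

and $c_0 \in \alpha_t(h_0)$.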

Assume that the properties hold for $p'$ and consider $p = p' \cdot (s', t')$, i.e., $p'$ followed
by a primitive statement $s'$ executed by a thread $t'$. The case where $s'$ is an assignment
or a guard is very similar to that of proof A.5: we took care to update (ii) to reflect
the change in the evaluation of variables in expressions $\mathbb{E}_{\mathcal{C}}\llbracket X \rrbracket$ (in particular, the
use of $\mathit{intf}$ to determine which interferences from other threads can influence the current
one, given their respective configuration). The effect of $\mathit{enabled}_{t'}$ in $\mathbb{S}_{\mathcal{H}}\llbracket s', t' \rrbracket$ is
to remove states from $R(\Pi_{\mathcal{H}}\llbracket p' \rrbracket(\mathcal{E}_{0h}, \emptyset))$, and so, it does not invalidate (ii). Moreover,
as assignments and guards do not modify the scheduler state, the subsequent $\mathit{sched}$
application has no effect. Consider now the case where $s'$ is a synchronization primitive.
Then (i) holds as these primitives do not modify the error set. We now prove (ii). Given
$(h, \rho) \in R(\Pi_{\mathcal{H}}\llbracket p \rrbracket(\mathcal{E}_{0h}, \emptyset))$, there is a corresponding $(h_1, \rho_1) \in R(\Pi_{\mathcal{H}}\llbracket p' \rrbracket(\mathcal{E}_{0h}, \emptyset))$ such that
$(h, \rho) \in R(\mathbb{S}_{\mathcal{H}}\llbracket s', t' \rrbracket(\{(h_1, \rho_1)\}, \emptyset))$. Given $t \in \mathcal{T}$, we apply (ii) on $(h_1, \rho_1)$ and $p'$, and get
a state $(c_1, \rho'_1) \in R(\Pi_{\mathcal{C}}\llbracket \mathit{proj}_t(p'), t \rrbracket(\mathcal{E}_{0c}, \emptyset, I_{\mathcal{C}}))$ with $c_1 \in \alpha_t(h_1)$. We will note $(l_1, u_1, -)$
the components of $c_1$. We construct a state in $\mathcal{C} \times \mathcal{E}$ satisfying (ii) for $p$. We first study the
case where $t = t'$ and consider several sub-cases depending on $s'$:

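• Case $s' = \mathbf{yield}$. We have $\rho_1 = \rho$ and $\alpha_t(h_1) = \alpha_t(h)$. We choose $c = (l_1, \emptyset, \mathit{weak})$. Then
$(c, \rho'_1) \in R(\mathbb{S}_{\mathcal{C}}\llbracket s', t \rrbracket(\{(c_1, \rho'_1)\}, \Omega, I_{\mathcal{C}}))$. Moreover, $c \in \alpha_t(h)$. We also have, $\forall c' \in \mathcal{C}$:
$\mathit{intf}(c_1, c') \Longrightarrow \mathit{intf}(c, c')$. Hence, $\forall X \in \mathcal{V}$: either $\rho(X) = \rho'_1(X)$, or $\rho(X)$ comes from
some weakly consistent interference not in exclusion with $c_1$, and so, not in exclusion with
$c$. As a consequence, $(c, \rho'_1)$ satisfies (ii) for $p$.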

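• Case $s' = \mathbf{lock}(m)$. We choose $c = (l_1 \cup \{m\}, \emptyset, \mathit{weak})$. This ensures as before that
$c \in \alpha_t(h)$. Moreover, $\rho_1 = \rho$. We now construct $\rho'$ such that

$$(c, \rho') \in R(\mathbb{S}_{\mathcal{C}}\llbracket s', t \rrbracket(\{(c_1, \rho'_1)\}, \Omega, I_{\mathcal{C}}))$$

and

$$\forall X \in \mathcal{V} : \rho(X) = \rho'(X) \text{ or } \exists t'', c' : t' \neq t'',\ \mathit{intf}(c, c'),\ (t'', c', X, \rho(X)) \in I_{\mathcal{C}}.$$

  - If $\rho'_1(X) = \rho_1(X)$, then we take $\rho'(X) = \rho'_1(X) = \rho_1(X) = \rho(X)$.

  - Otherwise, we know that $\rho(X) = \rho_1(X)$ comes from a weakly consistent interference
    compatible with $c_1$: $\exists t'', c' : t' \neq t'',\ \mathit{intf}(c_1, c'),\ (t'', c', X, \rho_1(X)) \in I_{\mathcal{C}}$. If $\mathit{intf}(c, c')$
    as well, then the same weakly consistent interference is compatible with $c$ and can be
    applied to $\rho'$. We can thus set $\rho'(X) = \rho'_1(X)$.

  - Otherwise, as $\mathit{intf}(c, c')$ does not hold, then either $m \in l'$ or $m \in u'$, where $(l', u', -) = c'$.

    Assume that $m \in l'$, i.e., $\rho_1(X)$ was written to $X$ by thread $t''$ while holding the
    mutex $m$. Because $R(\Pi_{\mathcal{H}}\llbracket p \rrbracket(\mathcal{E}_{0h}, \emptyset)) \neq \emptyset$, the mutex $m$ is unlocked before $t'$ executes
    $s' = \mathbf{lock}(m)$, so, $t''$ executes $\mathbf{unlock}(m)$ at some point in an environment mapping
    $X$ to $\rho_1(X)$. Note that $\mathbb{S}_{\mathcal{C}}\llbracket \mathbf{unlock}(m), t'' \rrbracket$ calls $\mathit{out}$ to convert the weakly consistent
    interference $(t'', c', X, \rho_1(X)) \in I_{\mathcal{C}}$ to a well synchronized interference
    $(t'', (l' \setminus \{m\}, u', \mathit{sync}(m)), X, \rho_1(X)) \in I_{\mathcal{C}}$. This interference is then imported by
    $\mathbb{S}_{\mathcal{C}}\llbracket \mathbf{lock}(m), t' \rrbracket$ through $\mathit{in}$. Thus:

    $$(c, \rho'_1[X \mapsto \rho_1(X)]) \in R(\Pi_{\mathcal{C}}\llbracket \mathit{proj}_t(p), t \rrbracket(\mathcal{E}_{0c}, \emptyset, I_{\mathcal{C}}))$$

    and we can set $\rho'(X) = \rho_1(X) = \rho(X)$.

    The case where $m \in u'$ is similar, except that the weakly consistent interference is
    converted to a well synchronized one by a statement $\mathbb{S}_{\mathcal{C}}\llbracket \mathbf{yield}, t'' \rrbracket$ or $\mathbb{S}_{\mathcal{C}}\llbracket \mathbf{unlock}(m'), t'' \rrbracket$
    for an arbitrary mutex $m'$, in an environment where $X$ maps to $\rho_1(X)$.

In all three cases, $(c, \rho')$ satisfies (ii) for $p$.
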
• Case $s' = \mathbf{unlock}(m)$. We have $\rho_1 = \rho$. We choose $c = (l_1 \setminus \{m\}, u_1, \mathit{weak})$, which implies
$c \in \alpha_t(h)$. Moreover, as in the case of $\mathbf{yield}$, $\forall c' : \mathit{intf}(c_1, c') \Longrightarrow \mathit{intf}(c, c')$. Similarly,
$(c, \rho'_1)$ satisfies (ii) for $p$.

• Case $s' = X \leftarrow \mathbf{islocked}(m)$. We have $\forall Y \neq X : \rho_1(Y) = \rho(Y)$ and $\rho(X) \in \{0, 1\}$.
When $X \leftarrow \mathbf{islocked}(m)$ is interpreted as $X \leftarrow [0, 1]$, the result is straightforward: we
set $c = c_1$ and $\rho' = \rho'_1[X \mapsto \rho(X)]$. Otherwise, if $\rho(X) = 0$, we set $c = (l_1, u_1 \cup \{m\}, \mathit{weak})$
and, if $\rho(X) = 1$, we set $c = (l_1, u_1 \setminus \{m\}, \mathit{weak})$, so that $c \in \alpha_t(h)$. Moreover, when
$\rho(X) = 0$, then $\rho'$ is constructed as in the case of $\mathbf{lock}(m)$, except that $\rho'(X)$ is set to 0.
Likewise, the case $\rho(X) = 1$ is similar to that of $\mathbf{yield}$ and $\mathbf{unlock}(m)$, except that we
also set $\rho' = \rho'_1[X \mapsto 1]$. In all cases $(c, \rho')$ satisfies (ii) for $p$.

We now study the case $t \neq t'$, which implies $\mathit{proj}_t(p) = \mathit{proj}_t(p')$. We prove that, in each
case, $(c_1, \rho'_1)$ satisfies (ii) for $p$:

• Case $s' = \mathbf{yield}$. We have $\rho_1 = \rho$ and $\alpha_t(h_1) = \alpha_t(h)$.

• Case $s' = \mathbf{lock}(m)$. We have $\rho_1 = \rho$. In order to ensure that $c_1 \in \alpha_t(h)$, we need to
ensure that $m \notin u_1$. We note that, by definition of $\mathbb{S}_{\mathcal{C}}$, a configuration with $m \in u_1$
can only be reached if the control path $\mathit{proj}_t(p')$ executed by the thread $t$ contains some
$X \leftarrow \mathbf{islocked}(m)$ statement not followed by any $\mathbf{lock}(m)$ nor $\mathbf{yield}$ statement, and no
thread $t'' > t$ locks $m$. We deduce that $t' < t$. Hence, the interleaved control path $p'$
preempts the thread $t$ after a non-blocking operation to schedule a lower-priority thread
$t'$. This is forbidden by the $\mathit{enabled}$ function: we have $R(\mathit{enabled}_{t'}(\Pi_{\mathcal{H}}\llbracket p' \rrbracket(\mathcal{E}_{0h}, \emptyset))) = \emptyset$,
hence $R(\Pi_{\mathcal{H}}\llbracket p \rrbracket(\mathcal{E}_{0h}, \emptyset)) = \emptyset$.

• Case $s' = \mathbf{unlock}(m)$. We have $\rho_1 = \rho$ and $\alpha_t(h_1) \subseteq \alpha_t(h)$.

• Case $s' = X \leftarrow \mathbf{islocked}(m)$. We have $\alpha_t(h_1) = \alpha_t(h)$. Moreover, $\forall Y \neq X : \rho_1(Y) = \rho(Y)$
and $\rho(X) \in \{0, 1\}$. To prove that (ii) holds, it is sufficient to prove that $\exists t'', c'$:
$t \neq t''$, $(t'', c', X, \rho(X)) \in I_{\mathcal{C}}$ and $\mathit{intf}(c_1, c')$, so that the value of $\rho(X)$ can be seen
as an interference from some $t''$ in $t$. We choose $t'' = t'$. Moreover, as $c'$, we choose
the configuration obtained by applying the recurrence hypothesis (ii) to $p'$ on $(h_1, \rho_1)$,
but for $t'$ instead of $t$. We then deduce, by definition of $\mathbb{S}_{\mathcal{C}}\llbracket X \leftarrow \mathbf{islocked}(m), t' \rrbracket$ on the
configuration $c'$, that there exist interferences $(t', c', X, 0), (t', c', X, 1) \in I_{\mathcal{C}}$.

This ends the proof that $\mathbb{P}_{\mathcal{H}} \subseteq \mathbb{P}_{\mathcal{C}}$.
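The two configuration-level facts used repeatedly in the case analysis above can be checked by brute force on a small mutex set. The sketch below is hedged in one important respect: the definition of $\mathit{intf}$ is not restated in this appendix, so the code ASSUMES that $\mathit{intf}((l, u, s), (l', u', s'))$ holds exactly when both configurations are $\mathit{weak}$ and $l \cap l' = l \cap u' = u \cap l' = \emptyset$. The function names and the two-mutex universe are hypothetical.

```python
from itertools import combinations, product

MUTEXES = frozenset({"m1", "m2"})  # hypothetical two-mutex universe


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def alpha(t, held):
    """alpha_t(b, l): `held` plays the role of l (thread -> mutexes held).
    The ready-state map b is dropped, since the abstraction forgets it."""
    free = MUTEXES - frozenset().union(*held.values())
    return {(held[t], u, "weak") for u in subsets(free)}


def intf(c, c2):
    """ASSUMED definition of intf (see the lead-in); not taken from this section."""
    (l, u, s), (l2, u2, s2) = c, c2
    return (s == "weak" and s2 == "weak"
            and not (l & l2) and not (l & u2) and not (u & l2))


def check_cases():
    """Check the implications used in the yield, unlock and lock cases (t = t')."""
    configs = [(l, u, "weak") for l in subsets(MUTEXES) for u in subsets(MUTEXES - l)]
    for c1, c2, m in product(configs, configs, MUTEXES):
        if not intf(c1, c2):
            continue
        l1, u1, _ = c1
        l2, u2, _ = c2
        after_yield = (l1, frozenset(), "weak")
        after_unlock = (l1 - {m}, u1, "weak")
        after_lock = (l1 | {m}, frozenset(), "weak")
        # yield / unlock: compatibility with c1 is preserved.
        if not intf(after_yield, c2) or not intf(after_unlock, c2):
            return False
        # lock(m): compatibility can only be lost because m is in l' or u'.
        if not intf(after_lock, c2) and not (m in l2 or m in u2):
            return False
    return True


example = alpha("t1", {"t1": frozenset({"m1"}), "t2": frozenset()})
ok = check_cases()
```

Under the assumed definition, `ok` is true: interferences compatible with $c_1$ remain compatible after $\mathbf{yield}$ and $\mathbf{unlock}(m)$, and after $\mathbf{lock}(m)$ they can only be excluded through $m$ itself.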



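We now prove that $\mathbb{P}'_{\mathcal{H}} \subseteq \mathbb{P}_{\mathcal{C}}$. The proof is similar to A.7. Given an original path $p$
and the transformed path $p'$ of a thread $t$, given any $(R, \Omega, I)$ such that $I$ is consistent with
fresh and local variables, we prove:

(i) $\Omega(\Pi_{\mathcal{C}}\llbracket p', t \rrbracket(R, \Omega, I)) \subseteq \Omega(\Pi_{\mathcal{C}}\llbracket p, t \rrbracket(R, \Omega, I))$

(ii) $\forall (t', c, X, v) \in I(\Pi_{\mathcal{C}}\llbracket p', t \rrbracket(R, \Omega, I))$ :

$(t', c, X, v) \in I(\Pi_{\mathcal{C}}\llbracket p, t \rrbracket(R, \Omega, I))$ or $t = t' \wedge X \in \mathcal{V}_l(t)$

(iii) $\forall (c, \rho') \in R(\Pi_{\mathcal{C}}\llbracket p', t \rrbracket(R, \Omega, I)) : \exists \rho \in \mathcal{E} : (c, \rho) \in R(\Pi_{\mathcal{C}}\llbracket p, t \rrbracket(R, \Omega, I))$,
$\forall X \in \mathcal{V} : \rho(X) = \rho'(X)$ or $X \in \mathcal{V}_f$

Note, in particular, that in (ii) and (iii), the configuration $c$ is the same for the original
path $p$ and the transformed path $p'$.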



We first consider the case of acceptable elementary operations from Def. 3.4. As
they involve guards and assignments only, not synchronization primitives, and because
$\mathbb{S}_{\mathcal{C}}\llbracket X \leftarrow e, t \rrbracket$ and $\mathbb{S}_{\mathcal{C}}\llbracket e \bowtie 0?, t \rrbracket$ never change the partitioning, the proof of (i)-(iii) for
all elementary transformations is identical to that of proof A.7.

We now prove that, if (i)-(iii) hold for a pair $p, p'$, then they also hold for a pair $p \cdot s$, $p' \cdot s$
for any statement $s$. The case where $s$ is an assignment or guard is also identical to that
of proof A.7, for the same reason. We now consider the case of synchronization primitives.
(i) holds because synchronization primitives do not change the set of errors. The proof
that (ii) holds is similar to the case of an assignment. Indeed, any interference added by
$s$ after $p'$ has the form $(t, c, X, \rho'(X))$ for some state $(c, \rho') \in R(\Pi_{\mathcal{C}}\llbracket p', t \rrbracket(R, \Omega, I))$. Due
to (iii), there is some $(c, \rho) \in R(\Pi_{\mathcal{C}}\llbracket p, t \rrbracket(R, \Omega, I))$ where, either $X \in \mathcal{V}_f$ or $\rho(X) = \rho'(X)$.
When $X \notin \mathcal{V}_f$, we note that any $(t, c, X, \rho'(X))$ added by $s$ after $p'$ is also added by $s$
after $p$. Thus, the extra interferences in $p'$ do not violate (ii). For the proof of (iii),
take $(c, \rho') \in R(\Pi_{\mathcal{C}}\llbracket p' \cdot s, t \rrbracket(R, \Omega, I))$. There exists some $(c, \rho'_1) \in R(\Pi_{\mathcal{C}}\llbracket p', t \rrbracket(R, \Omega, I))$
where, for all $X \in \mathcal{V}$, either $\rho'(X) = \rho'_1(X)$, or there is a well synchronized interference
$(t', c', X, \rho'(X))$ with $t \neq t'$. By applying (ii) to the pair $p, p'$, all these interferences are
also in $I(\Pi_{\mathcal{C}}\llbracket p, t \rrbracket(R, \Omega, I))$. Thus, by applying (iii) to the pair $p, p'$, we get a state $(c, \rho_1) \in
R(\Pi_{\mathcal{C}}\llbracket p, t \rrbracket(R, \Omega, I))$ that also satisfies (iii) for the pair $p \cdot s$, $p' \cdot s$.

As in proof A.7, the fact that, for any $p, p', p'', q$, if the pairs $p, p'$ and $p', p''$ satisfy
(i)-(iii), then so do the pair $p, p''$ and the pair $q \cdot p$, $q \cdot p'$ is straightforward. This completes
the proof that elementary path transformations can be applied in sequence and applied in
a context containing any primitive statements (even synchronization ones) before and after
the transformed part.



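The end of the proof is identical to that of proof A.7. We compare the fixpoints $\operatorname{lfp} F$
and $\operatorname{lfp} F'$ that compute respectively the semantics of the original program $\mathbb{S}_{\mathcal{C}}\llbracket \mathit{body}_t, t \rrbracket =
\Pi_{\mathcal{C}}\llbracket \pi(\mathit{body}_t), t \rrbracket$ and the transformed program $\Pi_{\mathcal{C}}\llbracket \pi'(t), t \rrbracket$. Then, (i) and (ii) imply that
$F'(X) \sqsubseteq_{\mathcal{C}} F(X)$, and so, $\operatorname{lfp} F' \sqsubseteq_{\mathcal{C}} \operatorname{lfp} F$, except for interferences on local or free variables. In
particular, $\Omega(\operatorname{lfp} F') \subseteq \Omega(\operatorname{lfp} F)$. The interference semantics of the original program contains
all the errors that can occur in any program obtained by acceptable thread transformations.
As a consequence, $\mathbb{P}'_{\mathcal{H}} \subseteq \mathbb{P}_{\mathcal{C}}$. □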



A.9. Proof of Theorem 4.2, $\mathbb{P}_{\mathcal{C}} \subseteq \mathbb{P}^\sharp_{\mathcal{C}}$.



Proof. We remark that $\mathbb{P}_{\mathcal{C}}$ and $\mathbb{P}_I$, and so, $\mathbb{P}^\sharp_{\mathcal{C}}$ and $\mathbb{P}^\sharp_I$, are similar. In particular, the
definitions for non-primitive statements and the fixpoint computation of interferences have
the exact same structure. Hence, the proof A.6 applies directly to prove the soundness
of non-primitive statements as a consequence of the soundness of primitive statements.
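Moreover, for assignments and tests, the proof of soundness is identical to the proof A.6,
but componentwise for each $c \in \mathcal{C}$. To prove the soundness of synchronization statements,
we first observe the soundness of $\mathit{in}^\sharp$ and $\mathit{out}^\sharp$ in that: $\forall t \in \mathcal{T},\ l, u \subseteq \mathcal{M},\ m \in \mathcal{M},\ V^\sharp \in
\mathcal{E}^\sharp,\ I^\sharp \in \mathcal{I}^\sharp : \forall \rho \in \gamma_{\mathcal{E}}(V^\sharp)$ :

$$\mathit{in}(t, l, u, m, \rho, \gamma_I(I^\sharp)) \subseteq \gamma_{\mathcal{E}}(\mathit{in}^\sharp(t, l, u, m, V^\sharp, I^\sharp))$$
$$\mathit{out}(t, l, u, m, \rho, \gamma_I(I^\sharp)) \subseteq \gamma_I(\mathit{out}^\sharp(t, l, u, m, V^\sharp, I^\sharp)).$$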

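Secondly, we observe that $\mathbb{S}^\sharp_{\mathcal{C}}$ first reorganizes (without loss of information) the sets of
environment tuples $R \subseteq \mathcal{C} \times \mathcal{E}$ and interference tuples $I \subseteq \mathcal{T} \times \mathcal{C} \times \mathcal{V} \times \mathbb{R}$ appearing in $\mathbb{S}_{\mathcal{C}}$
as functions $R' = \lambda c . \{\, \rho \mid (c, \rho) \in R \,\}$ and $I' = \lambda t, c, X . \{\, v \mid (t, c, X, v) \in I \,\}$. Then,
it applies an abstraction in, respectively, $\mathcal{E}^\sharp$ and $\mathcal{N}^\sharp$ by replacing the set union $\cup$ by its
abstract counterparts $\cup^\sharp_{\mathcal{E}}$ and $\cup^\sharp_{\mathcal{N}}$. □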